Laura O'Grady, PhD

May 13, 2012

17:58

There are 5615 journals currently indexed in PubMed. I was curious to know which of these journals are publishing articles on eHealth. I searched the Medical Subject Headings (MeSH) using the word “eHealth” and found three entry terms: eHealth, Mobile Health and Telehealth. I adapted a script and ran individual searches for the years 2010 and 2011, and another search that included records from 1977 to the present. This resulted in 4908 articles. The findings are graphed in Figure 1.

Figure 1: Journals indexed in PubMed using MeSH term “eHealth”

(If you wish to know the full name of the journal you can look it up in the National Library of Medicine LocatorPlus).

A wide variety of journals contained articles that were indexed using these terms, including some written in languages other than English (Sov zdravoohr is in Russian and Lakartidningen is in Swedish). One note of interest is the change in the abbreviation for the British Medical Journal from Br Med J to BMJ, which resulted in it being listed twice. I was somewhat shocked that neither Informatics for Health and Social Care (formerly known as Medical Informatics and the Internet in Medicine) nor BMC Medical Informatics and Decision Making appears on the list, given that their content includes publications in eHealth. It may be that these articles are indexed only with the term “informatics”, which is listed in MeSH.
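The double listing of the British Medical Journal could be handled at the tallying stage. What follows is a minimal Python sketch, not the script used to produce Figure 1: the alias table and the sample input are hypothetical stand-ins for whatever a PubMed export would supply.

```python
from collections import Counter

# Hypothetical alias table: maps an older journal abbreviation to its
# current one so that both are counted as the same journal.
ALIASES = {"Br Med J": "BMJ"}


def tally_journals(abbreviations, aliases=ALIASES):
    """Count articles per journal after merging known aliases."""
    counts = Counter()
    for abbr in abbreviations:
        name = abbr.strip()
        counts[aliases.get(name, name)] += 1
    return counts


# Illustrative input only: one journal abbreviation per retrieved article.
sample = ["BMJ", "Br Med J", "Lakartidningen", "BMJ"]
print(tally_journals(sample).most_common())
```

Any other journal that changed its abbreviation over the search period would need its own entry in the alias table.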

I conducted the search again using the MeSH terms for “Social Media” (Social Media, Social Medium and Web 2.0). This resulted in 721 hits, which is not surprising given that these terms are relatively new in comparison to those associated with eHealth. Figure 2 illustrates the findings.

Figure 2: Journals indexed in PubMed using MeSH term “Social Media”

When you submit an article for publication to a journal you are often asked to supply keywords that describe the content of your paper. In some cases you are explicitly asked to use MeSH terms. In cases where an article does not have MeSH terms, it is indexed by staff at PubMed. In either case there may be publications that include material on eHealth or social media that are not being labeled as such.

There are six journals (Journal of Medical Internet Research, Studies in Health Technology and Informatics, Conference proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, AMIA Annual Symposium Proceedings, Caring and the British Medical Journal) that are found on both lists. Either the authors of these articles, the journals that publish these papers, or PubMed is ensuring that these terms are used.
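Finding the journals that appear on both lists is a set intersection. A small Python sketch, assuming each search result has already been reduced to a list of journal abbreviations; the sample lists are illustrative subsets, and "Example Journal" is a made-up placeholder:

```python
def journals_on_both(ehealth_journals, social_media_journals):
    """Return the journals present in both result lists, sorted by name."""
    return sorted(set(ehealth_journals) & set(social_media_journals))


# Illustrative subsets only; the real lists come from the two searches.
ehealth = ["J Med Internet Res", "BMJ", "Example Journal"]
social_media = ["BMJ", "J Med Internet Res"]
print(journals_on_both(ehealth, social_media))
```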

Those who conduct searches of PubMed using these MeSH terms may be missing important publications in these fields. It is likely the authors of a paper who understand its content with the precision required to ensure proper indexing. We all want our publications to reach the right audience. Therefore, as authors we need to be aware of the MeSH terms and how they are used, and be more consistent in indexing our papers.

April 25, 2012

16:23

Introduction

Many of us are familiar with Charles Minard’s map of Napoleon’s March to Moscow in 1812 (Figure 1). This map has been reproduced in various publications including Edward Tufte’s “The Visual Display of Quantitative Information”. As Tufte noted, the map provides us with various pieces of data: the width of the brown line indicates the size of the army as it travels east, and that of the black line its size as it retreats in a westerly direction. The dates and temperatures that correspond with the progress of the march are included. Also provided are the longitude and latitude, which situate the location within the larger geographical context. This map, concise in format and useful in providing various forms of data, has been referred to as the “best visualization ever made”. I believe this conclusion should be reconsidered.

Figure 1: Napoleon’s March to Moscow

With recent technological advances such as Google Maps and application programming interfaces (APIs) we now have access to web-based tools that allow this map to be re-created in an interactive format. Figure 2 shows this map in a “Hybrid” view, which combines the Map and Terrain options.

Figure 2: Flow Map of Napoleon’s March to Moscow

Source: http://hci.stanford.edu/jheer/files/zoo/ex/maps/napoleon.html

The tools (located in the upper left corner of Figure 2) can be used to zoom in on specific areas of the map for greater detail or zoom out to situate it relative to other parts of Europe. However, even with this rendition we still do not know the complete story of what transpired on the march. For example, no explanation is available for why the army diminished in size. We can postulate that the soldiers engaged in battle or fell victim to disease, adverse weather conditions or perhaps starvation. History texts or other sources such as narratives by the soldiers would provide clarity. The map (Figure 1) was created by Minard fifty-seven years after the march took place. How might Napoleon have altered his strategy if simple details such as terrain or weather had been provided?

Conveying meaning through images

Below is an artist’s impression of troops traveling during the March to Moscow. What other kinds of information can we infer from this depiction? For example, we may sense that the troops are underdressed for the cold weather. Their hunched-over appearance may imply that they are overwhelmed by the weight of their packs or avoiding the freezing winds that are hitting their faces. Without any clear indication that they are carrying weapons beyond sticks or pitchforks, they do not appear to be well prepared for battle. But perhaps the most telling element conveyed in this picture is emotion. It looks like they are suffering.

Image I: The men of Napoleon’s March to Moscow

Source: http://www.rideandseek.com/expedition/napoleon/overview

Communication through narrative

The following quote is taken from the same web site as Image I. The weather has changed and the soldiers are now experiencing a much warmer climate.

 By this stage of Napoleon’s invasion it was the middle of July and it was the heat rather than the cold that was becoming a major obstacle.  One veteran described the hot conditions as, “worse than anything we’d known in Egypt”. Men died of heatstroke and dysentery at such a rate that the army had been reduced in size by a third when it reached Vitebsk. The remaining men were at the end of their endurance and they hadn’t even fought a single battle! Many of them had been on the march for three months, all the way from Paris with only two days’ rest. Others had endured a forced march for 32 hours covering a daunting 170km!

Source: http://www.rideandseek.com/expedition/napoleon/overview

What other types of information can we obtain from this narrative? We now know that the troops were suffering from illness, that some of them died not from combat but from disease, and that the heat was also causing fatalities.

Conclusion

Many of us who work as social scientists were trained within one discipline. If it was psychology, chances are you were schooled in quantitative research. Sociologists are more likely to have taken coursework in qualitative research. There are exceptions but expertise generally comes with a price. You often have to “pick a camp” and use the method associated with your field. There have been great advances in mixed methods, which is a relatively new field that combines the strengths of both of these methods. A few textbooks have been written on this concept. However, few universities offer mixed methods as a course because those in a position to teach have not been trained in both methods. There have also been efforts to create training programs that promote interdisciplinary collaboration. However, it is not known what happens to graduates of these programs when they return to their “home base”. How many carry on the effort of collecting both quantitative and qualitative data? We desperately need both, and we need to find ways to incorporate them in a meaningful way.

April 2, 2012

17:46

In a previous post I used Google Charts to explore how data from a web-based source (Statistics Canada) can be mined and displayed in a format that provides us with some insights. The data visualization (in the form of bar charts) demonstrated that rates of diabetes are increasing and more so in certain geographical areas of Canada. To help reduce these rates we need to further elucidate causative factors if and where possible.

Some individuals become diabetic because they are unaware they are at risk. Many may not be consciously aware of why they do not engage in behaviour change(s) despite known risks. Or they do not know that by changing their diet or exercising this risk can be reduced. In some circumstances individuals may be unable to engage in lifestyle changes. For example, in some geographically remote locations access to fresh fruits and vegetables may be limited, particularly in the off season. In suburban settings there may be more reliance on using transportation rather than walking. From a visualization perspective the nature and extent to which these issues play a causative role could be explored by overlaying data sets of these variables within a GIS (geographic information system) application. Unfortunately data of this nature are currently not available in Canada. In addition, messaging can be inconsistent. We also lack detailed knowledge of the ways in which family physicians and various media are used to inform the public of diabetes risk factors. Is the messaging consistent and effective? How can we move forward on prevention issues without understanding all the variables involved in relation to increasing diabetic rates?

In a recent study my colleagues and I explored the ways in which people with diabetes used the Internet, in particular a web-based message forum to tag or label posts as well as search for credible content using this tagging format. The study involved usability testing, interviews as well as surveys. There may be some clues in examining the dialogue that stemmed from these interviews.

In the coding of the interviews one concept predominantly mentioned was the notion of anecdotal or experiential information. In this context an anecdotal source means information that is learned informally from others with diabetes. One common phrase to describe this concept is “tricks of the trade”. Although I did not have the type of data I had first thought about (and outlined above) I had some qualitative findings about the ways in which people with diabetes view one aspect, anecdotal information, in relation to living with this disease. But I was also curious about the rhetoric around how providers discuss the treatment of diabetes. Since health providers were not part of this study I decided to examine the “Canadian Diabetes Association 2008 Clinical Practice Guidelines for the Prevention and Management of Diabetes in Canada”. I also wanted to use information visualization techniques to explore this material. I formalized my research objective for this investigation as, “In what ways do patients and providers dialogue about the care and management of diabetes?”

In the following sections I will provide details about the method, findings, discussion, limitations and some final thoughts.

Method

The purpose of this study was to explore, not explain, two sources of information (one obtained in a research study with patient participants and another using written guidelines intended for health professionals to treat patients with diabetes) using a visualization technique. One tool that readily provides a visual representation of written content is a word cloud, which can be created using a web-based tool at Wordle.

The written content from the patients living with diabetes was obtained by using quotes from an interview study that had been coded as “anecdotal”. It was copied and pasted into the word cloud utility at Wordle. The written material from the provider perspective was obtained by using the “Canadian Diabetes Association 2008 Clinical Practice Guidelines for the Prevention and Management of Diabetes in Canada”. The specific sections that were used included the material on “Insulin Therapy in Type 1 Diabetes” and “Pharmacologic Management of Type 2 Diabetes”. This material was also copied and pasted into the Wordle word cloud application. Both samples were rendered using the same font (“Lucida Sans”), colour scheme (“Ghostly”) and layout (“mostly horizontal”) format to facilitate comparison.

Findings

According to the word count feature at Wordle the three most frequent words from the patient participant anecdotal quotes (Image I) were “people” (eight times), “know” and “like”, each of which appeared seven times. The words “the”, “I”, “to”, “and”, “that”, “of”, “in” and “what” were excluded. No medical terms were mentioned (this could be because the topic matter for these quotes was anecdotal information). The words “individual” and “diabetic” were each used once.
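A word count of this kind can be reproduced outside of Wordle. The Python sketch below is an approximation under stated assumptions: the exclusion list is the one given above, but the way Wordle itself splits text into words is not documented here and may differ, and the sample sentence is made up rather than taken from the interviews.

```python
import re
from collections import Counter

# Words reported above as excluded from the patient word count.
EXCLUDED = {"the", "i", "to", "and", "that", "of", "in", "what"}


def word_frequencies(text, excluded=EXCLUDED):
    """Count case-insensitive word occurrences, skipping excluded words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in excluded)


# Made-up sentence for illustration; it is not interview data.
sample = "I know what people like, and people know that."
print(word_frequencies(sample).most_common(3))
```

Running the same function over both text samples with the same exclusion list would keep the two counts comparable.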

Image I: Word cloud from patient participant anecdotal quotes

In the provider guidelines as shown in Image II (with the words “should”, “be”, “to”, “with” and “of” removed) the top three words were “insulin”, “antihyperglycemic” and “agents”, each of which was used three times, as were “regimens” and “lifestyle”. The most common words from the patient participant anecdotal quotes (“people”, “know” and “like”) were not represented in the sections of the guidelines used to render that word cloud (Image II).

Image II: Word cloud from Canadian Diabetes Association Guidelines

Discussion

Not surprisingly, patients and providers were seemingly focused on different issues. Or it could be that they used different language to articulate these issues. My first thought on seeing the frequency of the word “people” in the patient participant anecdotal quotes was to compare it to “individuals” in the provider guideline sample, which appeared very small in the latter cloud. Is it possible that in this context the patients mean “community”, as in a group of people with diabetes, while the providers are referring to those they treat?

Limitations

The quotes used to create the word cloud represented material specifically about anecdotal information. This section was chosen because it was the most frequently identified content in the qualitative interviews. In addition the participants in the study were not representative of people with diabetes across Canada. The goal of the study was not related to diabetes prevention. However, given the open format of qualitative interviews, many participants shared information beyond the initial intent of the inquiry.

Final Thoughts

Wordle is a tool that helped me visualize something I probably would not have noticed otherwise. Although it does not offer an explanation (the context of these terms needs to be explored) it did allow me to explore the content in a new way. It is a very simplistic means of interpretation (word count) but I’m excited about the possibilities that this and other types of information visualization can bring to aid exploring qualitative research. On one level this very (very!) exploratory examination could be an indication of a much deeper problem: the issues that providers value and consider important to include in treatment guidelines may be quite different from those that patients value. How do we get patients and providers on the same page? Social media may be one way of closing the gap. Electronic health records that provide a space for patients and providers to dialogue may be another means. Either way each side needs to be aware of the differences and acknowledge that moving towards a shared repertoire through mutual engagement to negotiate new meaning is imperative to help reduce increasing rates of diabetes.

March 19, 2012

11:09

Like most people I have a hectic schedule and a lot of information to process. As an interdisciplinary researcher my interests include but are not limited to: health care, technology and emerging methodologies to measure effectiveness. To curate and parse this material I use a variety of web-based sources. Some of the content comes from automated searches of peer reviewed journal articles, email subscriptions to new issues of journals, RSS feeds, app-based tools such as Zite and also Twitter. Of these, Twitter is perhaps the most difficult source in which to maintain a good signal-to-noise ratio. I have learned to quickly scan the table of contents of journals and article titles in blog postings or other online media. By using Google as my RSS reader I am provided with useful statistics that include data on which feeds I click on, save and email to myself. This has allowed me to drop feeds that I had not realized I was not using.

Over the past few years I have examined many tools designed to search, analyze and measure the use of Twitter such as Social Bro, Social Report, Tweetgrader, Tweetstats, Tweetreach, Klout, Peer Review and many others. I have also conducted in-depth reviews of two of these tools: Twitsprout and ThinkUp. My purpose for examining these tools was to determine better and faster ways to find relevant information from Twitter when I need it. I would like a centralized means to make decisions about how to receive and manage my Twitter feed.

I have compiled the following list of features that I feel would make Twitter a better tool:

  1. Whose tweets do I re-tweet the most?
  2. Whose tweets do I email to myself the most?
  3. Of those I follow, whose tweets contain the links that I click on the most? Which of these do I then save for closer reading and/or archive?
  4. Who am I most likely to respond to with a question, answer or request for clarification?
  5. When does someone stop following me and what were the tweets I sent around this time period (e.g. in the 24/48 hours previously)?
  6. How far do my tweets reach? (e.g. tweetreach.com)
  7. A useful statistic from Social Network Analysis: a ranking by betweenness centrality, and a mechanism within the client to remove those I am following with the lowest ranking
  8. A way to indicate which of the people following me are tweeting essentially the same content (as evidenced by the link or even analysis of the content; this should be relatively easy to conduct as it is only 140 characters)
  9. How many tweets do I get every day? From whom do I get the most/least tweets? How can I sort my tweets based on user?
  10. Timestamps of when the tweet was sent. Using references like one hour or three hours or yesterday is not a very useful method if it requires you to think (my calendar or other measures of time do not use this system).
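As an illustration of item 7, betweenness centrality can be computed with Brandes' algorithm. The Python sketch below makes several assumptions that go beyond the wish list: follow relationships are treated as undirected links (on Twitter they are directed), every account appears as a key in the input, and the account names are hypothetical.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    `graph` maps each account to the set of accounts it is linked to.
    Returns unnormalized scores (each unordered pair counted once).
    """
    scores = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        predecessors = {node: [] for node in graph}
        paths = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            order.append(current)
            for neighbour in graph[current]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[current] + 1:
                    paths[neighbour] += paths[current]
                    predecessors[neighbour].append(current)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = paths[pred] / paths[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Each undirected pair was visited from both ends, so halve the totals.
    return {node: score / 2.0 for node, score in scores.items()}


# Hypothetical accounts; a link means one account follows the other.
network = {
    "alice": {"bob"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"bob"},
    "dave": {"bob"},
}
scores = betweenness_centrality(network)
# Accounts listed from lowest to highest score.
print(sorted(scores, key=scores.get))
```

A client could present the lowest-ranked accounts as candidates to stop following, leaving the final decision to the user.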

Please feel free to add to this wish list in the comment section below and circulate it to anyone you know who is designing or developing these applications.

March 5, 2012

0:14

For some time now I have wanted to explore data using some of the advanced charting tools that are available on the Internet. I’ve looked at quite a few options including Tableau and some GIS (Geographic Information System) programs such as ArcGIS and Instant Atlas. Most of these were expensive or used complex and cumbersome interfaces that had a steep learning curve.

When I first looked at the Google Public Data Explorer I was impressed by the “higher order” use of CSV files to store the data, the XML to parse it and its use of HTML to render the output. The use of CSV files saves a lot of time when it comes to entering new data as you only need to change the file when you have new figures to add. However the need to use separate sets of variables is counter-intuitive to my training in statistics, where all the data resides in one file. Although the use of motion charts and their ability to convey meaning by displaying change over time was of great interest to me, in the end my real world data set (diabetic rates from Statistics Canada) was limited and did not lend itself to display in this format. Instead I used Google Charts, which allows you to code characteristics of your chart as well as enter the data directly in the HTML file. Much simpler in the short term. But this tool may not be a good choice in the long run as it means editing the code rather than calling up new CSV data files as utilized by the Public Data Explorer.
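The trade-off between data typed into the page and data kept in a separate CSV file can be sketched in Python. This is only an illustration of the idea, not how either Google tool works internally: the column layout (a "province" column plus one column per year) is an assumption, and the values are placeholders, not Statistics Canada figures.

```python
import csv
import io


def chart_rows(csv_text):
    """Turn CSV text with a 'province' column and one column per year
    into a header row plus numeric data rows, a row-per-category table
    of the kind bar chart tools commonly accept."""
    reader = csv.DictReader(io.StringIO(csv_text))
    years = [name for name in reader.fieldnames if name != "province"]
    rows = [["province"] + years]
    for record in reader:
        # An empty cell marks a value that was not published.
        values = [float(record[y]) if record[y] else None for y in years]
        rows.append([record["province"]] + values)
    return rows


# Placeholder values for illustration only; not Statistics Canada data.
sample_csv = """province,2005,2007
Province A,1.0,2.0
Province B,,3.5
"""
for row in chart_rows(sample_csv):
    print(row)
```

With this arrangement, adding a new year means adding a column to the data file while the charting code stays untouched.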

By using the basic bar chart option I was able to plot the overall rate of diabetes in nine provinces for five years (2005, 2007, 2008, 2009 and 2010). According to the data source, information for some provinces was considered not reliable enough to be published. There was no explanation provided for why data was not available for the year 2006.

Image 1 (below) shows these results in a static bar chart format.

Image 1: Canadian diabetes rates (source: Statistics Canada)

(I also learned that it is not easy to display a Google Chart in WordPress. To see the interactive elements of the chart click here for the dynamic version.)

The chart shows that rates of diabetes are higher on the east coast and that they have been increasing in past years (although there is a decrease in 2010). According to the Public Health Agency of Canada’s 2011 report on diabetes there are a number of associated risks including genetics, being overweight, not exercising, certain ethnic origins and increasing age. These tend to explain the “how” but not the “why”. For example, why does someone who knows they are at risk genetically for diabetes also not exercise or eat properly, which increases their risk? Changing these types of behaviours requires a deeper understanding of why they are taking place.

In the next installment I will explore how we can use other sources of information to help contextualize the numbers that were represented in the bar chart, which may lead to a better understanding of ways in which behaviours can be changed.

Note: April 3rd – The follow-up post, “Patients and providers: identifying a diabetes dialogue gap?” is now available.

February 13, 2012

18:32

It has been said that if you didn’t pay for the product then you are the product.

Nowhere is this more evident than with the plethora of social media analytics applications currently available on the Internet. For example, there are dozens of applications that provide feedback on your use of Twitter (see in particular the list under “Twitter Account Analysis tools”). In a former era, web sites such as Tucows acted as repositories from which you could download a utility (e.g. an FTP or telnet application) and use it without concern that data on your usage was being collected and sold. I do not claim to know the financing structure of the companies that provide these newer applications but I do know they must have some paid employees in order to function.

The field of analytics has grown from log file analysis that tracks movements online, to consumer-oriented web surveys, to data extraction from message forums, Twitter, blogs and wikis and the use of purchasing patterns in the form of recommender systems. The social analytics application ThinkUp, which refers to itself as a Social Media Insights Platform (http://thinkupapp.com/), is one of the first (if not the only) open source tools to help evaluate use of various applications such as Facebook, Twitter and Google+. Recently out of beta, this tool is installed and maintained by the end user. Therefore you have complete control over who can access data from your social media usage.

I have ThinkUp configured to display tweets from one account as well as to geoencode them by plotting their location on Google Maps. The main page provides links for the following sections: “Dashboard”, “Tweets”, “Followers”, “Who you follow” and “Links” (Image 1). Under Dashboard you will find “Hot Posts”, “Recent Activity”, “Followers by day”, “Followers by week”, “This week’s most re-tweeted”, “Post types” and “Client Usage for all your posts”.

 Image 1: Twitter Options in ThinkUp

I have Twitter configured to email me when there is a re-tweet or reply, or when someone new follows my account, so I do not find many of these metrics very useful. I can easily keep track of my account activity (four to six posts per day with an average of one re-tweet and about five hundred followers) in regard to this metric. Others with more active accounts may find this feature invaluable.

Recent activity does not provide the date (Image 2).

Image 2: “Recent Activity”

The “Post types” section (Image 3) categorizes my tweets as being 87% broadcaster (defined as “post contain links”) and 3% conversationalist (posts are replies). I believe this is an accurate representation of my tweets. However, since this only adds up to 90% I am left wondering how the other 10% of tweets would be categorized.
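One way to avoid an unexplained remainder is to give every tweet a label, including an explicit "other" bucket. The Python sketch below is not ThinkUp's implementation, which is not described here. The field names `text` and `in_reply_to` are hypothetical, and counting a reply that also contains a link as a reply is an assumption.

```python
from collections import Counter


def post_type(tweet):
    """Label a tweet using the two definitions quoted above; anything
    else falls into a third bucket so the shares always sum to 100%."""
    if tweet.get("in_reply_to"):
        return "conversationalist"
    text = tweet["text"]
    if "http://" in text or "https://" in text:
        return "broadcaster"
    return "other"


def post_type_percentages(tweets):
    """Return the percentage of tweets in each category."""
    counts = Counter(post_type(t) for t in tweets)
    total = len(tweets)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up tweets for illustration only.
sample = [
    {"text": "New post http://example.com/a"},
    {"text": "Another https://example.com/b"},
    {"text": "@someone thanks", "in_reply_to": "someone"},
    {"text": "Good morning"},
]
print(post_type_percentages(sample))
```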

Image 3: “Post Types”

For those who have more than two categories it might be easier to read if a dashboard colour scheme were employed. Using the same shade of blue for every bar in the chart makes it even harder to distinguish the values.

There are similar colour issues with the pie chart used to display the “Client Usage” (Image 4).

Image 4: “Client Usage (all posts)”

To be more cognitively efficient bar charts should be used instead of pie charts, as bar lengths are easier to compare than the angles that are intended to be representational in the pie. In addition the pie chart inadvertently cuts off some of the names of the client applications I have used to post tweets. This is a minor problem as in this case I am able to determine which is being referred to since most of the letters in the name do appear. This may not be true for others. I do not find this information of much use but those who share a Twitter account (e.g. one that is used by more than one person at a company or organization) may find it to be of value.

The information provided under “Tweets” includes: “Your Tweets”, “Tweets to You”, “Most Replied-To All Time”, “Most Retweeted All Time”, “Favorites” and “Inquiries”. The categories of “Your Tweets”, “Tweets to You” and “Favorites” are already provided by many Twitter applications. The “Most Replied-To All Time” and “Most Retweeted All Time” could be of value but do not include the date. Instead the number of days or months since the post was made is provided so you are required to count back and guess the actual day the tweet was made. Knowing the exact date could provide a context helpful in explaining why this occurred. “Inquiries” is merely a list of tweets that end in a question mark. This is deceptive because not all of these tweets represent a question. They could be the title of a post or an article that happens to include a question mark. In other words I am not asking a question.

On the page under “Followers” the “All-Time Most Discerning Followers” and “Most Popular Followers” are provided. This information would be of more value if I were able to discern if any of these accounts were re-tweeting my posts or clicking on my links. The “Follower count by day”, “week” and “month” are provided. See Image 5 for an example of “Follower count by week”. Also included is the list membership by day, week and month. This information is also of little use without understanding the context in which increases or decreases occurred. Did I suddenly obtain or lose a large number of followers based on a certain tweet or set of tweets? Providing the ability to cross tabulate the “followers” by the “tweet” may lead to some insights in this area. The ability to view the tweets and conduct text analysis would make this a much more powerful tool.
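The cross tabulation suggested above could work roughly as follows. This Python sketch is hypothetical and not a ThinkUp feature: it assumes daily follower totals and tweet dates are available, uses a two-day window as an arbitrary choice, and the sample numbers are invented.

```python
from datetime import date, timedelta


def tweets_before_drops(follower_counts, tweets, window_days=2):
    """For each day the follower count fell, list the tweets sent in
    the preceding `window_days` days (including the day of the drop).

    `follower_counts` maps a date to that day's follower total;
    `tweets` is a list of (date, text) pairs.
    """
    report = {}
    days = sorted(follower_counts)
    for previous, current in zip(days, days[1:]):
        change = follower_counts[current] - follower_counts[previous]
        if change < 0:
            start = current - timedelta(days=window_days)
            nearby = [text for sent, text in tweets if start <= sent <= current]
            report[current] = {"change": change, "tweets": nearby}
    return report


# Invented numbers and tweets for illustration only.
counts = {date(2012, 2, 1): 500, date(2012, 2, 2): 501, date(2012, 2, 3): 498}
sent = [
    (date(2012, 1, 20), "older tweet"),
    (date(2012, 2, 1), "tweet A"),
    (date(2012, 2, 3), "tweet B"),
]
print(tweets_before_drops(counts, sent))
```

A listing like this only shows which tweets were sent near a change; it cannot by itself establish that a given tweet caused followers to leave.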

Image 5: “Follower count by week”

The section “Who you follow” includes “Chatterboxes”, presumably those who tweet a lot, but to what extent is not stated. This could be context dependent: it may represent Twitter accounts that post 100 or 10 tweets per day. Is this calculated in relation to “Deadbeats” (who are presumably those who rarely post) and “Popular” (those who have many followers)? I do not need to know if someone is followed by a lot of people. I need to know if the content they post is relevant to my needs. Providing me with information on how many times I’ve clicked on or emailed a tweet in a topic area is of much more value to me. It provides me with information that I need to know: relevance.

Under the final section, “Links”, you will find “Links by favorites”, “Links by friends” and “Photos by friends”. Neither of the first two provides me with any “actionable” information. I can find the links in my favorites by using the Twitter client. Photos by friends could be useful if you forgot to save something when it was first tweeted but you have no control over how far back ThinkUp renders this data.

I could not get the geoencoding function to work on my installation.

Overall I would say that this application has great promise, particularly as it is only recently out of beta. I appreciate very much that it is open source and hope the programming community continues to find time to contribute to its development. However, it does not yet provide me, as a social scientist, with the kind of information I would like to see in an analytics application.
