Tuesday, June 24, 2008

hakia Adds 10 million PubMed Articles to its Semantic Search Engine

PubMed.gov is one of the largest data aggregation points in medicine, and the only one that covers more than 4000 journal entries. We are proud to announce that hakia has QDEXed more than 10 million PubMed abstracts, and is now offering PubMed search exclusively at pubmed.hakia.com, or at hakia.com as part of a general search.

You don’t know what you are missing using PubMed’s own Search Engine
We start with an observation: PubMed’s own search engine has some serious holes in it, and users may not realize what they are missing. Although we do not like making side-by-side comparisons just for the sake of example, in this case there seems to be no other way to demonstrate how important it is to search efficiently for health information on the Internet.

The first query is a simple one: Protein C deficiency

As part of a general search, hakia’s first result from PubMed is an article by Nizzi FA Jr and Kaplan HS, from the University of Texas Southwestern Medical Center, that is all about Protein C and S deficiency. PubMed’s own search engine not only fails to return this article, but all 20 of its results are irrelevant to this query.

Protein C deficiency is not a dull subject. It causes blood clots and should be on the radar of medical doctors, nurses, medical students, researchers, and even the average health consumer.

The next query is a bit more research-oriented: phosphorylation sites in glycine

The situation is the same. hakia’s first result from PubMed is an article by Luca Z. et al., from Vanderbilt University School of Medicine. PubMed’s own search engine fails to return this abstract, and nothing in its first 20 results seems to be related.

To keep this post from becoming too repetitious, we will conclude with a third example, picked from the dozens we have analyzed. This time, the query is a more general concept in genetics: modulation of ion channels.

hakia’s first result from PubMed is an article by Dascal N. from Sackler School of Medicine, Tel Aviv University. PubMed’s own search engine fails to return this abstract, and its first 20 results do not appear particularly relevant.

Google Site Search for PubMed shows the same holes

So, we turned to Google site search. The query protein C deficiency fails. So do the other two queries: phosphorylation sites in glycine and modulation of ion channels. However, these failures are expected from Google due to (1) undefined coverage, and (2) limitations of the popularity algorithms.

Semantic search making a difference at the basic level

These examples are a reminder that semantic search technology can make a difference at the basic level of retrieval because of its built-in consistency, and because the technology does not depend on any statistics. We have not even discussed the semantic variations between the queries and the text.

hakia’s PubMed coverage will continue to be updated on a daily basis, as we use the power of semantic algorithms to handle dynamic data (new abstracts emerge daily). Stay tuned for an update.
