Peer-Reviewed

Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology

Received: 14 May 2017     Accepted: 1 June 2017     Published: 30 August 2017
Abstract

Recent advances in communication technologies, computer hardware, and database technologies have made it easy for organizations to collect, store, and manipulate massive amounts of data. As the volume of data increases, the proportion of it that people can understand decreases substantially. Applying learning algorithms to knowledge discovery is a promising and relevant area of research that offers new possibilities and benefits in real-world applications such as blood bank data warehouses. The availability of an optimal blood supply is a critical aspect of a blood transfusion service. Blood banks typically rely on healthy people voluntarily donating blood for transfusions. The ability to identify regular blood donors enables blood banks and voluntary organizations to plan and organize blood donation camps efficiently. The objective of this study was to explore the applicability of data mining technology in the Ethiopian national blood bank service by developing a predictive model that supports donor recruitment strategies. The model identifies donors at risk of transfusion-transmissible infections (TTIs), which helps in collecting safe blood and, in turn, in maintaining an optimal blood supply. The analysis was carried out on a dataset of 14575 blood donors with at least one pathogen, using the J48 decision tree and Naive Bayes algorithms implemented in Weka. The J48 decision tree algorithm, with an overall model accuracy of 94%, yielded interesting rules. Of a total of 156729 consecutive blood donors, 14757 (9.41%) had serological evidence of infection with at least one pathogen and 29 (0.19%) had multiple infections. The overall seroprevalence of HIV, HBV, and HCV was 2.29%, 5.23%, and 2.30%, respectively. The seropositivity of TTIs was significant among business owners, students, civil servants, unemployed individuals, and drivers, and in the age groups 25 to 34 and 35 to 44 years.
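The prevalence figures in the abstract are simple ratios of positive donors to a reference population. The following minimal Python sketch reproduces the overall figure from the two counts reported above; the function name `seroprevalence` and the constant names are illustrative and do not come from the article. With these counts the overall ratio comes to roughly 9.4%, in line with the reported 9.41%. The sketch also suggests, as an inference from the arithmetic rather than a statement in the source, that the reported 0.19% for multiple infections is closest to 29 divided by the 14757 seropositive donors, not by all donors screened.

```python
def seroprevalence(positive: int, population: int) -> float:
    """Return the share of `population` that tested positive, in percent."""
    if population <= 0:
        raise ValueError("population must be a positive integer")
    return 100.0 * positive / population


# Counts taken from the abstract.
TOTAL_DONORS = 156_729      # consecutive blood donors screened
POSITIVE_ANY = 14_757       # donors with at least one pathogen
MULTIPLE_INFECTIONS = 29    # donors with more than one pathogen

overall = seroprevalence(POSITIVE_ANY, TOTAL_DONORS)

# Assumption: the denominator for multiple infections is the seropositive
# group, because that choice is the one that approximates the reported 0.19%.
multiple_among_positive = seroprevalence(MULTIPLE_INFECTIONS, POSITIVE_ANY)
multiple_among_all = seroprevalence(MULTIPLE_INFECTIONS, TOTAL_DONORS)

print(f"At least one pathogen: {overall:.2f}% of all donors")
print(f"Multiple infections: {multiple_among_positive:.2f}% of seropositive donors")
print(f"Multiple infections: {multiple_among_all:.3f}% of all donors")
```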

Published in American Journal of Artificial Intelligence (Volume 1, Issue 1)
DOI 10.11648/j.ajai.20170101.16
Page(s) 44-55
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2017. Published by Science Publishing Group

Keywords

Data Mining, Blood Bank, HIV, HBV, HCV, CRISP-DM, Ethiopia

Cite This Article
  • APA Style

    Haftom Gebregziabher, Million Meshasha, Patrick Cerna. (2017). Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology. American Journal of Artificial Intelligence, 1(1), 44-55. https://doi.org/10.11648/j.ajai.20170101.16


    ACS Style

    Haftom Gebregziabher; Million Meshasha; Patrick Cerna. Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology. Am. J. Artif. Intell. 2017, 1(1), 44-55. doi: 10.11648/j.ajai.20170101.16


    AMA Style

    Haftom Gebregziabher, Million Meshasha, Patrick Cerna. Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology. Am J Artif Intell. 2017;1(1):44-55. doi: 10.11648/j.ajai.20170101.16


  • BibTeX

    @article{10.11648/j.ajai.20170101.16,
      author = {Haftom Gebregziabher and Million Meshasha and Patrick Cerna},
      title = {Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology},
      journal = {American Journal of Artificial Intelligence},
      volume = {1},
      number = {1},
      pages = {44-55},
      doi = {10.11648/j.ajai.20170101.16},
      url = {https://doi.org/10.11648/j.ajai.20170101.16},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajai.20170101.16},
     year = {2017}
    }
    


  • RIS

    TY  - JOUR
    T1  - Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology
    AU  - Haftom Gebregziabher
    AU  - Million Meshasha
    AU  - Patrick Cerna
    Y1  - 2017/08/30
    PY  - 2017
    N1  - https://doi.org/10.11648/j.ajai.20170101.16
    DO  - 10.11648/j.ajai.20170101.16
    T2  - American Journal of Artificial Intelligence
    JF  - American Journal of Artificial Intelligence
    JO  - American Journal of Artificial Intelligence
    SP  - 44
    EP  - 55
    PB  - Science Publishing Group
    SN  - 2639-9733
    UR  - https://doi.org/10.11648/j.ajai.20170101.16
    VL  - 1
    IS  - 1
    ER  - 


Author Information
  • Department of Information Technology, Federal TVET Institute, Addis, Ethiopia

  • Department of Information Science, Addis Ababa University, Addis, Ethiopia

  • Department of Information Technology, Federal TVET Institute, Addis, Ethiopia
