
MCIT Announces the Participation of 4 Data Providers in Arabic GenAI Project (Fanar)

Thursday, May 23, 2024
  • Awqaf Ministry, QNL, Al Jazeera, and ACRPS are contributing data to the Arab Model for GenAI (Fanar) project.
  • Contributors are building a significant database of Arabic content for AI solutions in machine translation, voice recognition, and natural language processing.
  • HE the Minister of Communications and Information Technology emphasized the positive impact of this cooperation on Arabic AI.
  • The data will enhance the Fanar project’s ability to generate accurate Arabic content.

Today, the Artificial Intelligence Committee at the Ministry of Communications and Information Technology (MCIT) announced the participation of four prominent institutions in the Arabic Generative Artificial Intelligence (GenAI) project “Fanar”. The announcement was made during a meeting that included representatives from the government and academic sectors. The participating institutions are the Ministry of Endowments and Islamic Affairs (Awqaf), Qatar National Library (QNL), Al Jazeera, and the Arab Center for Research and Policy Studies (ACRPS). The Fanar project was launched through an announcement by His Excellency Sheikh Mohammed bin Abdulrahman bin Jassim Al Thani, Prime Minister and Minister of Foreign Affairs, at the opening of the 4th edition of the Qatar Economic Forum.

This collaboration is part of ongoing efforts by MCIT and its partners to provide Fanar, the Arabic GenAI model, with high-quality data and to reach a target of 300 billion words.

The data providers will contribute to building an extensive database of Arabic content, which will serve as a valuable resource for AI solutions such as machine translation, voice recognition, and natural language processing in Arabic. These new data sources will enhance the development of a more accurate and efficient linguistic model for Fanar, thereby improving its overall performance and understanding of the Arabic language.

The involvement of these prestigious institutions marks a significant step in the development of the Fanar project, aligning with its goals to strengthen the performance of Arabic GenAI, reinforce the position of Arabic on the global stage, and broaden its applications.

Commenting on this cooperation, His Excellency Mr. Mohammed bin Ali Al Mannai, Minister of Communications and Information Technology, said, “We are pleased with the participation of data provider institutions in the Arabic GenAI project (Fanar). This participation is a significant step in developing the project, as these entities possess rich and diverse historical, linguistic, cultural, and technical resources of Arabic content.”

His Excellency added, “We believe that this cooperation will have a positive impact on Arabic Large Language Models (LLMs), and we look forward to having more local and international parties join as data sources for the Fanar project. This will support our efforts to preserve our authentic Arabic language and its essence. As the diversity of data sources increases, the project’s richness and ability to interact with Arabic content will become more accurate and comprehensive.”

The Ministry of Endowments and Islamic Affairs (Awqaf) will contribute to the project by providing religious and cultural content through the linguistic resources available on its Islamic Network website, “Islam Web.” This includes a vast array of Islamic sciences, legal fatwas, and advisory services in medicine, culture, and family matters, making it the largest repository of Islamic and cultural content globally. Additionally, the site houses an extensive collection of knowledge across various sciences and arts. It has approximately 263,000 fatwas, over 148,000 transcribed video and audio lectures, more than 200,000 consultations, and over 70,000 cultural articles.

The Ministry stated that its participation in Fanar is part of its commitment to enhancing cooperation and partnership with state institutions and of its mission to serve the Qatari community and the global audience. Awqaf aims to strengthen the role of Islam in promoting human values and high moral principles. It also seeks to teach and disseminate the Arabic language, one of the most historically and culturally rich languages in the world. Its participation in Fanar therefore aligns with its core missions and values.

Dr. Nasser bin Salem Al Ansari, Director of IT Operations and Infrastructure at QNL, praised MCIT’s efforts in promoting the Arabic language in AI technologies. He stated: “We value the efforts of MCIT in enhancing the role of the Arabic language and developing its capabilities and influence in AI technologies. We are pleased to be partners in this national project.”

Dr. Al Ansari also mentioned that this collaboration aligns with the library’s mission to preserve the heritage of the country and the region and underscores QNL’s commitment to highlighting the richness of the Arabic language through various initiatives and activities.

He further elaborated on the library’s significant strides in digitizing Arabic texts: “The library has come a long way in digitizing Arabic texts with high accuracy, and we have a substantial amount of digital Arabic content. As part of our participation in the Fanar project, we are pleased to contribute our technical capabilities in the field of digitization, as well as the digital and digitized content we possess, including Arabic publications, documents, manuscripts, magazines, newspapers, and books. This will help train the project’s models on a larger and more diverse dataset.”

Finally, he noted that QNL is pleased to cooperate with prestigious authorities in this pioneering project, which will enhance AI capabilities in understanding and processing the Arabic language with high efficiency and accuracy.

Al Jazeera Network possesses an extensive archive of broadcasting hours and visual content, along with a vast collection of articles on its website, establishing itself as one of the largest and most renowned Arabic media networks globally. The network comprises television channels, digital platforms, a research center, and a media institute, which gives the project access to a substantial volume of audiovisual content and media engagements in Arabic.

Ahmed Al-Fahad, the Executive Director of Technology & Network Operations at Al Jazeera Media Network, commented, “This endeavor marks the culmination of Al Jazeera’s persistent efforts to leverage artificial intelligence, striving for excellence in media consistent with our Arab Islamic values and authentic culture. As a trailblazer in advanced technology and AI applications, this initiative plays a pivotal role in enhancing and preserving the distinctive linguistic and cultural attributes of the Arabic language.”

The research stock at the Arab Center for Research and Policy Studies includes thousands of academic research papers and studies published in peer-reviewed journals, along with hundreds of original and translated books, since its founding in 2010. ACRPS has rich and extensive content in Arabic, particularly in its project the Doha Historical Dictionary of Arabic Language. What distinguishes ACRPS content is its focus on Arab and Islamic issues from a genuine local perspective, which will enable linguistic models to perform in a manner that respects the culture of Qatar and adheres to its higher values and issues.

Dr. Fadi Zaraqat, head of the Digital and Arabic Social Field Unit at ACRPS, emphasized the importance of building Arabic LLMs. He stated that developing such models is central to the center’s mission, with its own models being developed under the name DALLA: Doha Arabic Large Language Model, based on open-source prototypes. Dr. Zaraqat highlighted the necessity of developing local, culturally aware Arabic models for various knowledge sectors, particularly in education and media. He pointed out that the widespread use of these models in these sectors would influence the spread and dominance of content and the value systems embedded in the linguistic models. ACRPS views its contribution to building these models as a core interest, and it expressed its appreciation for MCIT’s initiative in this domain.

As the founder of the Fanar project, the Qatar Computing Research Institute seeks to provide innovative and advanced solutions in artificial intelligence. It also works to develop projects that enhance technical capabilities and contribute to achieving scientific and research development in Qatar and the region. Through the Fanar project, the institute aims to apply the latest technologies in artificial intelligence to improve the quality of Arabic content in AI.

For his part, Dr. Ahmed K. Elmagarmid, Acting Vice President for Research at Hamad Bin Khalifa University and Executive Director of the Qatar Computing Research Institute (QCRI), said: “We are proud of the participation of four major national institutions in the Arab Model for Generative Artificial Intelligence (Fanar) project, which represents a qualitative leap in supporting and enhancing the project, as the availability of high-quality data resources is fundamental to creating and improving language technologies.” He stressed that QCRI plays an important role in developing Arabic language processing within its primary mission of ensuring the language’s prosperity in the digital world, by conducting advanced research into its technologies, producing tools, creating resources, linguistic communication, and supporting the development of its solutions and applications. He added that this work has had a tangible impact in increasing awareness about Arabic language technologies and their contribution to the development of the knowledge economy in the State of Qatar and Arabic-speaking communities around the world.

This cooperation comes as an important step towards achieving Qatar National Vision 2030, demonstrating leadership in AI and promoting the Arabic language in the digital realm. It aims to develop accurate Arabic LLMs that align with our Arab values, culture, and heritage. The data offered by this cooperation will enhance Fanar’s capabilities in creating better and more accurate Arabic content, enabling the development of Arabic AI solutions such as machine translation, speech recognition, and text summarization. Furthermore, this initiative will support scientific research in AI.