Indian Dataset for Data Science Projects

Data Sources

harshityadav95 Slide from AI For Everyone by Andrew Ng on Coursera

When I started on my plan to build a real-world project that serves a function, the first data I needed was GIS data: coordinates of public utilities such as schools, educational institutions, and hospitals, for India to start with. For other countries, such data sources are easily available online, are used as learning datasets for beginner data science projects, and are hosted on all the major public dataset repositories. In India, the government is increasingly taking the initiative to make data publicly available, and the NIC (National Informatics Centre) is working on various projects to make that data more accessible. However, the GIS data I was looking for is available in the public domain only as a preview through Bharat Maps, not for download. While Googling for data, though, I did compile a list of sources in the process.

A few really good examples to check out are:


PS: If you are looking for other specific data, such as IRIS data for Indian citizens, street signs, handwriting, and other biomedical data, it can be found on the project sites of various universities in India.

Here is a compiled list of various data sources to check out when working on your next data science project:

1. Data.Gov.In

It publishes datasets, documents, tools, and applications collected by the government for public use, and supports community participation through visualizations, APIs, alerts, and so on. It also collects all the government-based datasets discussed below.


2. National Portal of India

It was developed by the Indian government to provide single-window access to the information and services of all government entities. It is a single point of access to a lot of information, including a searchable contact directory, a database of government websites, and more.


3. Ministry Of Statistics And Programme Implementation Dataset

The datasets are collected by conducting large-scale sample surveys across India for various parameters, which eventually feed into the database. The ministry applies standard statistical techniques, along with extensive inspection and supervision, to enable this.


4. RBI Database Of Indian Economy

It is loaded with information and data suitable for researchers, analysts, and general users alike. It has datasets covering money and banking, financial markets, national income, saving and employment, and more. The idea is to make modern styles of data analysis easy, providing important real-time numbers on economic activity, prices, and more.


5. Gateway To Indian Earth Observation

An initiative by ISRO, this open data archive provides free satellite data, a product download facility, and thematic datasets. It uses a crowdsourcing approach to collect enrichment and point-of-interest data. It also acts as a platform to host government data, such as data from the forest department. Apart from being a data repository, it lets users explore 2D and 3D representations of the earth's surface, pest surveillance, disaster services, and high-resolution imagery of cities, among others.


6. India Weather Data

This source offers datasets on various meteorological indicators, water resource planning, rainfall, and more from across India, available to users in simple formats. It also contains data on several other parameters such as temperature, pressure, relative humidity, precipitation amount, wind speed, and solar radiation, among others.

Link (Freemium) :

7. Aadhaar Metadata

This provides a huge database built from daily counts of total registrations and of enrolment applications accepted and rejected, broken down by state and district. It also contains other details, such as Aadhaar numbers generated by age, gender, etc.
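
As a rough illustration of how daily counts broken down by state and district could be rolled up, here is a minimal Python sketch. The sample rows, the column names (`state`, `district`, `accepted`, `rejected`), and the helper `totals_by_state` are all made up for illustration; the actual Aadhaar files may use a different layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows: the numbers are invented, and the real files'
# column names and layout may differ.
SAMPLE_CSV = """state,district,accepted,rejected
Telangana,Hyderabad,120,5
Telangana,Warangal,80,3
Kerala,Ernakulam,60,2
"""


def totals_by_state(csv_text):
    """Sum accepted and rejected enrolment counts per state."""
    totals = defaultdict(lambda: {"accepted": 0, "rejected": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["state"]]["accepted"] += int(row["accepted"])
        totals[row["state"]]["rejected"] += int(row["rejected"])
    return dict(totals)


if __name__ == "__main__":
    for state, counts in totals_by_state(SAMPLE_CSV).items():
        print(state, counts)
```

With a real download, the same function should work once the column names are adjusted to match the file.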


8. Import and Export Datasets

The Indian Customs Electronic Commerce/Electronic Data Interchange Gateway (ICEGATE) is a portal with e-filing services for trade and cargo carriers. ICEGATE also handles the in-depth National Import Database (NIDB) and Export Commodity Database (ECDB) for the Directorate of Valuation. It carries information such as documents, messages, and other processes handled on the customs end by the Indian Customs EDI System (ICES).


9. Wildlife Institute of India Dataset

The Wildlife Institute of India (WII), an autonomous institution under the Ministry of Environment, Forest and Climate Change, Government of India, has datasets on different wildlife species in India. A total of 4591 specimens are housed at the WII herbarium, of which 4322 are digitized and published through the GBIF network. The data is mainly used by researchers and field managers from the respective protected areas of the country to prepare management plans and for other research.

Link : WII Herbarium Dataset, GBIF.
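
For a sense of scale, the specimen counts quoted above work out as follows. This is just the arithmetic on the two figures from the text; the variable names are my own.

```python
# Figures quoted in the text for the WII herbarium.
total_specimens = 4591
digitized = 4322

# Specimens not yet digitized, and the digitized share of the collection.
not_digitized = total_specimens - digitized
share = digitized / total_specimens

print(not_digitized)     # 269
print(f"{share:.1%}")    # 94.1%
```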

10. Open Data Telangana

Various datasets covering multiple domains, industries, and sectors, as well as agricultural data, for the state of Telangana.


11. National Data Repository

A data repository for seismic data, well and log data, spatial data, and other G&G (geological and geophysical) data such as drilling, reservoir, production, geological, and gravity and magnetic data, along with reports and documents.


12. Indiastat

A privately owned site that combines and categorizes data from the above data sources.


13. India Biodiversity Portal

A unique repository offering free and open access to information on India’s biodiversity. The portal aims to aggregate data through public participation and provide open and free access to biodiversity information.


14. National Health Systems Resource Center

A technical support institute for the National Health Mission, with datasets on healthcare.


15. Central Library, Indian Statistical Institute

In line with the objectives of the institute, over the years, the library has developed a comprehensive collection of peer-reviewed scholarly literature useful for the faculty and the research community of the institute. The other objective is to serve as a resource center for the scholars and scientific community of the country.


16. Bharat Maps

NIC/DeitY has created a Multi-Layer GIS Platform named “Bharat Maps” which depicts core foundation data as “NICMAPS”, an integrated base map service using 1:50,000 scale reference data from Survey of India, ISRO, FSI, RGI, and so on. This encompasses 23 layers containing administrative boundaries, transport layers such as roads & railways, forest layers, settlement locations, etc., including terrain map services.




  • Find or gather your own data and try to make some use out of it; the main learning lies in the process.
This post is licensed under CC BY 4.0 by the author.