Indian Dataset for Data Science Projects
Data Sources
Slide from AI For Everyone by DeepLearning.AI on Coursera, taught by Andrew Ng
Starting on my plan to build a real-world project that serves a function, solvepao.com, the main initial data I needed was GIS data: coordinates of public utilities such as schools, educational institutions, hospitals, etc., for India to start with. For other countries, such data sources are easily available online, are used as learning datasets for beginner data science projects, and are hosted on all major public dataset repositories. For India, there is a growing initiative by the government to make data publicly available, and the NIC (National Informatics Centre) is working on various projects to make that data more accessible. However, the GIS data I was looking for is only available in the public domain to preview using Bharat Map, not for download. While I was Googling for data, I did compile a list in the process.
A few really good examples to check out are:
- https://stategisportal.nic.in/stategisportal
- https://bhuvan.nrsc.gov.in/bhuvan_links.php
- https://indiabiodiversity.org/?lang=en
PS: If you are looking for other specific data such as IRIS data for Indian citizens, street signs, handwriting, or biomedical data, these can be found on various university project sites in India.
Here is a compiled list of various data sources one can check out when working on the next data science project:
1. Data.Gov.In
It publishes datasets, documents, tools, and applications collected by the government for public use, and encourages community participation through visualizations, APIs, alerts, etc. It is also a collection of all the government-based datasets discussed below.
Link: https://data.gov.in
2. National Portal of India
It was developed by the Indian government to facilitate single-window access to the information and services of all government entities. As a single point of access to a lot of information, it has a searchable contact directory, a database of government websites, and more.
Link: https://www.india.gov.in
3. Ministry Of Statistics And Programme Implementation Dataset
The datasets are collected by conducting large-scale sample surveys across India for various parameters, which eventually leads to the creation of the database. The ministry applies standard statistical techniques along with extensive inspection and supervision to enable this.
Link: https://www.mospi.gov.in/data
4. RBI Database Of Indian Economy
It is loaded with useful information and data for researchers, analysts, and general users alike. It has datasets across money and banking, financial markets, national income, saving and employment, and others. The idea is to make present-day styles of data analysis easy, providing important real-time numbers about economic activity, prices, and more.
Link: https://dbie.rbi.org.in/DBIE/dbie.rbi?site=home
5. Gateway To Indian Earth Observation
An initiative by ISRO, the open data archive (records) provides free satellite data, a product download facility, and thematic datasets. It uses a crowdsourcing approach to collect enriching and point-of-interest data. It also acts as a platform to host government data, such as that of the forest department. Apart from being a repository of data, it allows users to explore 2D and 3D representations of the surface of the earth, pest surveillance, disaster services, and high-resolution imagery of cities, among others.
Link: https://bhuvan.nrsc.gov.in/bhuvan_links.php#
6. India Weather Data
With datasets for various meteorological indicators (measures), water resource planning, rainfall, and others from across various parts of India, these datasets are available for users in simple formats. It also contains databases for several other parameters such as temperature, pressure, relative humidity, precipitation amount, wind speed, and solar radiation, among others.
Link (Freemium) : https://www.meteoblue.com/en/weather/archive/export/india_el-salvador_3585481
7. Aadhaar Metadata
This provides a huge database generated from the daily count of total registrations and of enrolment applications accepted and rejected, broken down by state and district. It also contains other details such as Aadhaar generated by age, gender, etc.
Link: https://data.gov.in/dataset-group-name/aadhaar
8. Import and Export Datasets
The Indian Customs Electronic Commerce/Electronic Data Interchange Gateway (ICEGATE) is a portal with e-filing services for trade and cargo carriers. It also has an in-depth National Import Database (NIDB) and Export Commodity Database (ECDB) for the Directorate of Valuation, both handled by ICEGATE. It has information such as documents, messages, and other processes handled on the customs end by the Indian Customs EDI System (ICES).
Link: https://www.icegate.gov.in/jsp/DailyReport.jsp
9. Wildlife Institute of India Dataset
The Wildlife Institute of India (WII), an autonomous institution under the Ministry of Environment, Forest and Climate Change, Government of India, has datasets on different wildlife species in India. A total of 4591 specimens are housed at the WII herbarium, of which 4322 are digitized and published through the GBIF network. The data is mainly used by researchers and field managers from the respective protected areas of the country to prepare management plans and for other research.
Link : WII Herbarium Dataset, GBIF.
10. Open Data Telangana
Various datasets across multiple domains, including industry, sector, and agricultural data, for the state of Telangana.
Link: https://data.telangana.gov.in/search/type/dataset
11. National Data Repository
A data repository of seismic data, well and log data, spatial data, and other G&G data such as drilling, reservoir, production, geological, and gravity and magnetic data, along with reports and documents.
Link: https://www.ndrdgh.gov.in/NDR
12. Indiastat
A privately owned site that combines and categorizes data from the above data sources.
Link: https://www.indiastat.com
13. India Biodiversity Portal
A unique repository of information on India’s biodiversity. The portal aims to aggregate data through public participation and provide free and open access to biodiversity information.
Link: https://indiabiodiversity.org/?lang=en
14. National Health Systems Resource Center
A technical support institute working with the National Health Mission, with datasets on healthcare.
Link: https://nhsrcindia.org/health-systems-database
15. Central Library, Indian Statistical Institute
In line with the objectives of the institute, over the years, the library has developed a comprehensive collection of peer-reviewed scholarly literature useful for the faculty and the research community of the institute. The other objective is to serve as a resource center for the scholars and scientific community of the country.
Link: https://www.isical.ac.in/~library/data.php
16. Bharat Map
NIC/DeitY has created a Multi-Layer GIS Platform named “Bharat Maps” which depicts core foundation data as “NICMAPS”, an integrated base map service using 1:50,000 scale reference data from Survey of India, ISRO, FSI, RGI, and so on. This encompasses 23 layers containing administrative boundaries, transport layers such as roads & railways, forest layers, settlement locations, etc., including terrain map services.
Link: https://stategisportal.nic.in/stategisportal
Link: https://bharatmaps.gov.in
Link: https://bhuvan.nrsc.gov.in/bhuvan_links.php
- Find or gather your own data and try to make some use out of it; the main learning lies in the process.
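Once you have gathered coordinate data of the kind I was after (public utilities with a latitude and longitude), a first use of it could be a nearest-facility lookup. The snippet below is only a minimal sketch under stated assumptions: the CSV layout (`name`, `kind`, `lat`, `lon`), the facility names, and the helper functions `load_facilities` and `nearest` are all hypothetical and do not come from any of the portals listed above. The sample rows use approximate city-centre coordinates as placeholders, and distances are computed with the standard haversine formula.

```python
import csv
import io
import math

# Hypothetical sample data. The rows are placeholders (approximate city-centre
# coordinates), not real facility locations from any portal.
SAMPLE_CSV = """name,kind,lat,lon
Facility A,hospital,28.6139,77.2090
Facility B,school,19.0760,72.8777
Facility C,hospital,12.9716,77.5946
"""

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def load_facilities(text):
    """Parse CSV text into a list of dicts with float coordinates."""
    facilities = []
    for row in csv.DictReader(io.StringIO(text)):
        facilities.append({
            "name": row["name"],
            "kind": row["kind"],
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
        })
    return facilities


def nearest(facilities, lat, lon, kind=None):
    """Return the closest facility, optionally filtered by kind, or None."""
    candidates = [f for f in facilities if kind is None or f["kind"] == kind]
    if not candidates:
        return None
    return min(candidates,
               key=lambda f: haversine_km(lat, lon, f["lat"], f["lon"]))


if __name__ == "__main__":
    facilities = load_facilities(SAMPLE_CSV)
    # Query point: approximate coordinates of Hyderabad.
    hit = nearest(facilities, 17.3850, 78.4867, kind="hospital")
    print(hit["name"])
```

For a real project you would replace `SAMPLE_CSV` with a file you compiled yourself; a linear scan like this is fine for a few thousand points, while larger datasets would likely call for a spatial index.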