Data.gov: a catalog for USA data

The USA government produces and gathers a lot of data, but how do you find it? Over the last decade, the United States executive branch, and later Congress, have required the government to publish its data on the web. Data.gov provides a search engine for accessing that data.

The OPEN Government Data Act makes Data.gov a requirement in statute, rather than a policy. It requires federal agencies to publish their information online as open data, using standardized, machine-readable data formats, with their metadata included in the Data.gov catalog.

Open Government – Data.gov

This post is a departure from my usual. I am a technical person and love to look at code. But this information, which I found when looking for data samples, is actually extremely useful for non-technical people who need to do their jobs. Almost every business could benefit from more data, and much of the data on this site is accessible to people with a variety of skills.

Data.gov now regularly harvests information about datasets available across the US government, using common metadata. It provides links to websites, APIs, and FTP sites that offer the data in various formats, including HTML, CSV (comma-delimited text), XML, and JSON, as well as several others. In addition, other American government bodies (from state to city) can submit their data to the catalog, though the metadata is less consistent.

There are approximately 250K datasets available in the Data.gov catalog. Topics range from “Arctic” to “Water.” The data comes from various departments, from the Department of Agriculture to Veterans Affairs. I won’t go into detail about using the search engine (User Guide – Data.gov does a good job), but here are some of the interesting parts of the site that I wanted to highlight.

The data.gov home page provides several links.

The data.gov home page, displaying Most Viewed Datasets, Recently Added Datasets, Datasets by Organization, and Geospatial. A search button is also included.

You can do a quick search to find items by the following categories, or you can enter a search term in the search text box:

  • Most Viewed Datasets
  • Recently Added Datasets
  • Datasets by Organization
  • Geospatial

On the home page, click on the Data menu item. This brings you to Dataset – Catalog.

https://catalog.data.gov/dataset, showing a search catalog for the data.gov site

When I type the word “physician” in the search text box, I get 96 results. Each result gives a short summary of the contents and identifies the file types. It also identifies what level of government the information comes from, and you can get an idea of how popular the dataset is.

Displaying the results of the data.gov search for physician. Provides menu items for filtering by topics, departments, and tags.
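
For my technical friends, the same search can probably be scripted. The sketch below assumes the catalog exposes the standard CKAN action API (the package_search endpoint) at the URL shown. That URL, the response layout, and the helper names build_search_url, summarize, and search_catalog are my assumptions for illustration, not something the Data.gov User Guide spells out here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the catalog serves the standard CKAN action API at this path.
CATALOG_API = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(term, rows=5):
    """Build a package_search URL for a free-text search term."""
    return CATALOG_API + "?" + urlencode({"q": term, "rows": rows})


def summarize(response):
    """Return (total count, [(title, sorted file formats), ...]).

    Assumes the usual CKAN layout: response["result"]["count"] and
    response["result"]["results"], each dataset carrying "resources".
    """
    result = response["result"]
    summaries = []
    for dataset in result["results"]:
        formats = sorted(
            {res["format"] for res in dataset.get("resources", []) if res.get("format")}
        )
        summaries.append((dataset.get("title", ""), formats))
    return result["count"], summaries


def search_catalog(term, rows=5):
    """Fetch live results and summarize them (needs network access)."""
    with urlopen(build_search_url(term, rows)) as reply:
        return summarize(json.load(reply))
```

Calling search_catalog("physician") should give back the total number of matches plus the title and file types of the first few datasets, which is roughly what the results page shows. The count will change over time as the catalog is re-harvested.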

In addition, you can filter by other information:

  • Geographic location
  • Topics
  • Topic Categories
  • Dataset type
  • Tags
  • Formats
  • Organization types
  • Organizations
  • Publishers
  • Bureaus
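
The filters in the list above can likely be expressed in a script as well. This is a minimal sketch, assuming the catalog's CKAN API accepts a filter query (the fq parameter) on top of the search term. The field names used in the example (res_format for formats, tags for tags) are assumptions based on common CKAN setups, and build_filtered_url is a hypothetical helper, so check them against the live API before relying on them.

```python
from urllib.parse import urlencode

# Assumption: the catalog serves the standard CKAN action API at this path.
CATALOG_API = "https://catalog.data.gov/api/3/action/package_search"


def build_filtered_url(term, filters, rows=10):
    """Build a search URL that narrows a free-text search with filters.

    `filters` maps a field name to the required value, for example
    {"res_format": "CSV"}. The field names are assumptions.
    """
    params = {"q": term, "rows": rows}
    # Combine filters as quoted field:"value" pairs joined with AND.
    fq = " AND ".join(f'{field}:"{value}"' for field, value in filters.items())
    if fq:
        params["fq"] = fq
    return CATALOG_API + "?" + urlencode(params)


# Example: datasets matching "physician" that offer CSV files tagged "health".
url = build_filtered_url("physician", {"res_format": "CSV", "tags": "health"})
print(url)
```

Building the URL separately from fetching it makes it easy to paste the result into a browser first and confirm the filter behaves like the matching checkbox on the site.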

For my less technical friends, you can inform your decision making with free data.

  • Demographics
  • Economy
  • Job situation

For my technical friends, you can find data for testing various technology issues.

  • Accessing APIs
  • Different file formats
  • Changing metadata

Let me know how you have used these datasets in the comments.
