N    Email address    Description    Link to the data    Instructions
1blee21@stuy.eduThe dataset provides sales records of a certain good for different countries.
It provides details on info like the order date, unit price, and profit of
that specific item.
http://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/
There are files of different sizes depending on the number of sales records you
want in each csv file. Click the size you want to download. Some values in the
ship date and order date columns are blurred out.
2sliu21@stuy.eduminecrafthttp://minerl.io/dataset/no
3cchenwu20@stuy.eduThe data is about the PISA assessment and can display the overall scores of
different countries in categories like math, reading, and science.
http://nces.ed.gov/surveys/pisa/idepisa/report.aspx?p=1-RMS-1-20153,20123,20093,20063,20033,20003-PVMATH-TOTAL-AUS,AUT,BEL,CAN,CHL,CZE,DNK,EST,FIN,FRA,DEU,GRC,HUN,ISL,IRL,ISR,ITA,JPN,KOR,LVA,LUX,MEX,NLD,NZL,NOR,POL,PRT,SVK,SVN,ESP,SWE,CHE,TUR,GBR,USA-MN_MN-Y_J-0-0-37&Lang=1033
The website, https://nces.ed.gov/surveys/international/ide/, also allows the
user to choose certain variables or criteria that they want to see about the
data and get a report containing all the data that they want. For example, for
my particular set of data I chose to display all years, but you can also select
only one year to display.
4glin22@stuy.eduesports league statistics
https://assets.blz-contentstack.com/v3/assets/blt321317473c90505c/blt85c849e5a9c818e1/5e9de9aaeea3744a375d6b6a/match_map_stats.zip
Organized by stage of competition, team that won/lost each game, map played,
score per game, overall match winner
5aalryyes20@stuy.eduConnecticut's Accidental Drug Related Deaths from 2012-2016.https://catalog.data.gov/dataset/accidental-drug-related-deaths-january-2012-sept-2015This dataset provides a lot of information that could be analyzed, from age and
sex of the victims to the drug used.
6ddeng20@stuy.eduThe data details defense exports made by the US government to various countries
in terms of monetary value. I found this interesting due to the whole variety
of countries that I did not expect to be on this list such as Aruba and the
Cayman Islands.
https://catalog.data.gov/dataset/directorate-of-defense-trade-controls-2016-section-655-report
From the link, scroll down to download the pdf file.
7scheyney20@stuy.eduNumber of dog names in Anchorage Municipalityhttps://catalog.data.gov/dataset/dog-namesDownload from the link (there are CSV and other forms available).
8ywu20@stuy.edunames of babies
https://catalog.data.gov/dataset/most-popular-baby-names-by-sex-and-mothers-ethnic-group-new-york-city-8c742/resource/db4168b5-dbec-4b71-8c80-17b5ab4350e8
click download on the top right corner to download.
9mli20@stuy.eduThis data set describes the salaries of different occupations in New York State.
https://catalog.data.gov/dataset/occupational-employment-statistics
Simply click the download button next to the CSV file or any other type of file
you would like. This data set describes the mean wage, median wage, and entry
wage of different occupations in different parts of NY along with their
standard occupational code.
10nkuo20@stuy.eduContains graduation outcomes for the state of New York from 2005 to 2015https://catalog.data.gov/dataset/regents-exam-resultsCan be found on data.gov. Also includes file formats for RDF, JSON, and XML
11ilam20@stuy.eduAssessment of how properties in NY pay taxes.https://chriswhong.com/open-data/liberating-data-from-nyc-property-tax-bills/This is a very big file, with 1048576 rows, and there was an error message that
it was too big so I assume there are likely more rows.
12sliu21@stuy.educonnecticut votershttps://connvoters.com/download.htmlhere's michigan: https://michiganvoters.info/download.html and here's every
state's rules on voter registration lists:
https://www.ncsl.org/research/elections-and-campaigns/access-to-and-use-of-voter-registration-lists.aspx get the rest yourself
13vveytsman20@stuy.edu"CVE® is a list of entries—each containing an identification number, a
description, and at least one public reference—for publicly known cybersecurity
vulnerabilities." - from their site
https://cve.mitre.org/data/downloads/index.htmlThis data is used by the US National Vulnerability Database, but that database
doesn't have CSV downloads, only JSON and XML.
14bzhang20@stuy.eduLarge group of businesses that were charged for not very legal activities, along
with info about their location, name, date charged, reason for being
charged, and result of the trial
https://data.cityofnewyork.us/Business/Charges/5fn4-dr26Go to Business, then datasets, and it's the 9th one (for me at least)
15jsun20@stuy.eduThis shows consumer complaints against businesses and the reason for the
complaint
https://data.cityofnewyork.us/Business/Consumer-Services-Mediated-Complaints/nre2-6m2sit has a restitution column which shows how much the person that complained was
compensated for
16mquiroz20@stuy.eduThe data that I found was about consumer complaints against businesses that
were mediated by the DCA Consumer Services Division during the last and current
calendar years or general complaints against businesses.
https://data.cityofnewyork.us/Business/Consumer-Services-Mediated-Complaints/nre2-6m2s/data
I thought the data in the data set was very interesting because not only are
you given the complete address, but also the longitude and latitude. Therefore
it is possible to take the longitude and latitude, pair them with the business
name, and maybe plot them on a map.
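The pairing of business names with coordinates described above might be sketched as follows. This is only an illustration: the sample rows and the column names (`Business Name`, `Latitude`, `Longitude`) are assumptions, so check the header row of the real export before reusing it.

```python
import csv
import io

# Hypothetical sample; the column names are assumptions, not the real export's headers.
SAMPLE = """Business Name,Latitude,Longitude
Acme Deli,40.7178,-74.0139
No Coords Cafe,,
"""

def name_coordinates(csv_file):
    """Return (name, latitude, longitude) tuples, skipping rows without coordinates."""
    points = []
    for row in csv.DictReader(csv_file):
        try:
            lat = float(row["Latitude"])
            lon = float(row["Longitude"])
        except ValueError:
            # Blank or non-numeric coordinates cannot be plotted, so skip the row.
            continue
        points.append((row["Business Name"], lat, lon))
    return points

points = name_coordinates(io.StringIO(SAMPLE))
print(points)
```

With a downloaded file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file object; the resulting tuples could then be handed to whatever plotting tool is available.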
17ali20@stuy.eduCitywide Payroll datahttps://data.cityofnewyork.us/City-Government/Citywide-Payroll-Data-Fiscal-Year-/k397-673eJust a normal csv file.
18dsooknanan20@stuy.eduAverage class sizes of certain classes in schools in 2010-2011 (I even found
some data about Stuyvesant).
https://data.cityofnewyork.us/Education/2010-2011-Class-Size-School-level-detail/urz7-pzb3If you want the csv file just hit the export button in the top right and click
"csv."
19llottman20@stuy.edu2010 AP results for NYC schools
https://data.cityofnewyork.us/Education/2010-AP-College-Board-School-Level-Results/itfs-ms3e
there's no paywall and it can be downloaded as a csv file
20sanand20@stuy.eduThe data covers 258 schools in NYC and provides information regarding how many
AP test takers were in that school in 2010, how many APs were taken in the
respective school, and the number of exams with a score of 3, 4, or 5.
https://data.cityofnewyork.us/Education/2010-AP-College-Board-School-Level-Results/itfs-ms3e
Many cells are left blank when no data is available or when the number is 0,
rather than containing an actual 0 or N/A.
21lsmulansky20@stuy.edu# of APs taken in various schools and # of passing gradeshttps://data.cityofnewyork.us/Education/2012-AP-Results/9ct9-prf9There's a lot of lines with bad data ('s') and there's also a few schools with
commas/double quotes
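One possible way to cope with the 's' placeholders and the school names containing commas or double quotes is sketched below. The sample rows and header names are made up for illustration; Python's `csv` module handles the quoting, and the helper `to_int_or_none` is a hypothetical name.

```python
import csv
import io

# Made-up rows in the spirit of the AP results file; header names are assumptions.
SAMPLE = '''DBN,School Name,Num of AP Test Takers
01M000,"EXAMPLE SCHOOL, THE",25
01M001,"SCHOOL OF ""QUOTED"" NAMES",s
'''

def to_int_or_none(text):
    """Convert a cell to int; return None for 's', blanks, or anything non-numeric."""
    try:
        return int(text)
    except ValueError:
        return None

# csv.reader understands quoted fields, so embedded commas and doubled quotes survive.
rows = list(csv.reader(io.StringIO(SAMPLE)))
header, data = rows[0], rows[1:]
cleaned = [(name, to_int_or_none(count)) for _dbn, name, count in data]
print(cleaned)
```

Splitting each line on commas by hand would break the first school name in two, which is why the sketch relies on `csv.reader` instead.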
22acai20@stuy.eduMy data is of 2012 AP results. It lists the school, numbers of AP test takers,
number of AP tests taken, and the number of AP tests passed.
https://data.cityofnewyork.us/Education/2012-AP-Results/9ct9-prf9
It has problems very similar to the class example, such as 's' instead of
numbers, and the website does not do a good job sorting the information.
23ccursetjee20@stuy.eduSchool Student Demographics 2013-18https://data.cityofnewyork.us/Education/2013-2018-Demographic-Snapshot-School/s52a-8aq6Press Export and then download as a CSV file.
24jlin20@stuy.eduthis is a VADIR (Violent And Disruptive Incidents Report) for schools in New
York
https://data.cityofnewyork.us/Education/2014-2015-VADIR-INCIDENTS-3/diks-hcwd/data
I noticed that because there is a separate column for each crime, there are many
more columns than in other lists we've seen.
25mchan20@stuy.edu2016 School Safety Reporthttps://data.cityofnewyork.us/Education/2016-2017-School-Safety-Report/rear-wh5i/dataThe data is exportable to CSV files, so you can just download it to your
computer.
26mraskin20@stuy.eduSHSAT offers by New York City Middle School (2017-18)
https://data.cityofnewyork.us/Education/2017-2018-SHSAT-Admissions-Test-Offers-By-Sending-/vsgi-eeb5
This data set is very easy to access, as the data is presented very clearly,
and all of the columns are concepts we are familiar with (Middle Schools,
Offers, Applicants, etc.) One feature I really enjoyed exploring was the
visualization feature, which allows the user to click through different types
of graphs and see the data represented in a variety of ways. The most helpful
graph is probably the Combo Chart, because it clearly presents the rows of
middle schools, and their numbers in descending order. Highly recommend taking
a look!
27ofishman20@stuy.eduSHSAT school acceptances
https://data.cityofnewyork.us/Education/2017-2018-SHSAT-Admissions-Test-Offers-By-Sending-/vsgi-eeb5
no
28apatel20@stuy.edu2017 - 2020 Monthly Grade Level Attendance by School
https://data.cityofnewyork.us/Education/2017-2020-Monthly-Grade-Level-Attendance-by-School/xuid-t5nk/data
Can be accessed as a csv file. Really simple to understand, with 10 columns,
containing details about the school, the month, and the number of absences and
presences each month. It is published by the DOE, and it is updated monthly.
29bguzman20@stuy.eduList of Schools Receiving a Quality Review in 2018-2019
https://data.cityofnewyork.us/Education/2018-2019-Quality-Review-School-List/2rr4-rfvc/data
Search for list of school names in NYCOpenData, the first link.
30sding20@stuy.eduThis set of data is released by the NYC DOE. It's about the number of eighth
graders, number of SHSAT test takers, and the number of students who got offers
from specialized high schools in each NYC middle school in the 2018-2019 school
year.
https://data.cityofnewyork.us/Education/2018-2019-SHSAT-Admissions-Test-Offers-By-Sending-/uf53-ree9
This data set is interesting because it's about SHSAT admission numbers for the
year 2018-2019, which is the year we (as class of 2022) took the SHSAT and
entered a specialized high school. It shares a similar format with the 2010-SAT
data we worked on before.
31amai20@stuy.eduStudent attendance per school from 2017-2020
https://data.cityofnewyork.us/Education/2019-2020-SHSAT-Admissions-Test-Offers-By-Sending-/xuij-x4t4
N/A
32ckastoun20@stuy.eduNumber of offers into specialized high schools 2019-2020
https://data.cityofnewyork.us/Education/2019-2020-SHSAT-Admissions-Test-Offers-By-Sending-/xuij-x4t4
• It contains the name of the school, the school, the number of students in
  high school admissions, the number of students who took the SHSAT, and the
  total number of offers into specialized high schools for that year.
• Whenever the number of test-takers or students accepted is 5 or lower, an
  exact value is not provided and instead, the range 0-5 is given.
• The same issues seen in class arise with the built-in sorting mechanism of
  the website: 20 is placed above 100 when the data is sorted in "descending"
  order because it's sorting strings and not numbers.
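The string-versus-number sorting issue noted above can be reproduced in a few lines of Python. This is a minimal sketch with made-up values; treating the suppressed range "0-5" as 0 is an assumption made purely so the cells can be ordered.

```python
# Cell values as they would arrive from a CSV file: all strings.
values = ["100", "20", "0-5", "7"]

# Sorting strings compares character by character, so "20" lands above "100".
as_strings = sorted(values, reverse=True)

def sort_key(cell):
    """Treat the suppressed range '0-5' as 0 (an assumption made only for ordering)."""
    return 0 if cell == "0-5" else int(cell)

# Sorting with a numeric key gives the order a reader would expect.
as_numbers = sorted(values, key=sort_key, reverse=True)

print(as_strings)  # ['7', '20', '100', '0-5']
print(as_numbers)  # ['100', '20', '7', '0-5']
```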
33jlin27@stuy.eduFree meals in time of crisishttps://data.cityofnewyork.us/Education/COVID-19-Free-Meal-Locations/2rg5-7s3wDunno
34rjeffreys20@stuy.eduA table with information about squirrel sightings (location, date, color, etc.)
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw
None
35jshen20@stuy.eduThe data is the record of all the spotted squirrels in central park. There is
information about the location spotted, physical attributes, and the action
that the squirrel was spotted doing.
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw
The data set is relatively small; it only has about 3000 rows and each row
represents a different squirrel. There are 31 columns and the data types stored
are integers, floats, strings, true/false (boolean), and location
(longitude/latitude). There are a lot of entries that contain missing data,
especially the ones that are true/false. The data was last updated October 2019.
36gchen20@stuy.eduDescription of Squirrels spotted in Central Park
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw
I found this in the Environment section of NYC OpenData and scrolled down. It
includes a color map, which is helpful in visualizing the data. It includes
location, color, and action of squirrels.
37mzitu20@stuy.eduinfo on many of the squirrels in Central Park
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw
specifies color, age, location found, and what they were doing when they were
found.
38jmackenroth20@stuy.eduIt's a squirrel census, which records the number of sightings and details of
squirrels (location coordinates, age, primary and secondary fur color,
elevation, activities, communications, and interactions).
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw
You can download it as a CSV file. The species analyzed by this Squirrel census
is the Eastern gray (Sciurus carolinensis).
39sliu21@stuy.edusquirrels in central park
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw
no
40eyu20@stuy.eduThe database I found was of a squirrel census in Central Park in 2018. The
data was collected for a project titled The Squirrel Census. The
Squirrel Census is a multimedia science, design and storytelling project that
focuses specifically on the Eastern Gray Squirrel. The squirrels are counted in
the database and further details regarding each squirrel are listed as well. For
example, the file provides data on a specific squirrel's fur color, location
coordinates, age, elevation, activities, communications and interactions with
fellow squirrels and other life forms in Central Park.
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw/data
To look at the data without downloading just click the 'View Data' button and
to download the file as a csv (or other) file click the 'Export' button.
41ijiang20@stuy.eduThe data describes the squirrel population in Central Park, listing their age,
primary fur color, location, as well as movements (running, chasing, climbing,
eating) that are listed as True or False.
https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw/data
To access the data, export/download it as a CSV or TSV file.
42dlyalin20@stuy.eduA very detailed account of all the squirrels in Central Park, including where
they are and what they look like (secondary fur color). This is part of a
project called, appropriately enough, Squirrel Census. Also, quite
surprisingly, this came up when I searched "food." I would rather not know why.
https://data.cityofnewyork.us/Environment/2018-Squirrel-Census-Fur-Color-Map/fak5-wcftHovering over a circle lets you see more information about it. Pretty easy to
use.
43dhu20@stuy.eduAir quality (many metrics) of many areas in NYChttps://data.cityofnewyork.us/Environment/Air-Quality/c3uy-2p5rIt's a CSV
44jjiang20@stuy.eduThe green infrastructure put around the city.
https://data.cityofnewyork.us/Environment/DEP-Green-Infrastructure/spjh-pz7h#revert
By clicking the export tab you can download it as a csv file.
45vchang20@stuy.eduThis is a website that shows all the places where you can drop off food scraps.
It gives the location of the areas to the longitude and latitude. It also shows
the hours of operation.
https://data.cityofnewyork.us/Environment/Food-Scrap-Drop-Off-Locations-in-NYC/if26-z6xqYou have to download it as a csv file? The attachment listed is not usable.
46bwu20@stuy.eduSea levels
https://data.cityofnewyork.us/Environment/Sea-Level-Rise-Maps-2050s-500-year-Floodplain-/qwca-zqw3
Beside the map of New York City are tabs. To access the data, click on the
export tab and scroll down to find a csv file.
47rjeffreys20@stuy.eduInformation about calls for animal assistance, relocation, and/or rescue
completed by the Urban Park Rangers.
https://data.cityofnewyork.us/Environment/Urban-Park-Ranger-Animal-Condition-Response/fuhs-xmg2
Some of the dates are just a series of hashtags and not all of the cells are
filled out.
48eshi20@stuy.eduWater Consumption in NYC - How much water New Yorkers are using, 1979-2008
https://data.cityofnewyork.us/Environment/Water-Consumption-In-The-New-York-City/ia2d-e54m
It can be exported in .csv format. It's pretty straightforward and from a quick
glance, NYC is using less water per day despite its growing population.
Maybe another dataset could give a clue as to how.
49ayuen20@stuy.eduNew York water consumption
https://data.cityofnewyork.us/Environment/Water-Consumption-In-The-New-York-City/ia2d-e54m/data
click on link, export, and download as csv. It states population every year;
you can see if that correlates with water consumption.
50jwang20@stuy.eduRestaurant Inspection Results; reports based on how hygienic a restaurant is.
https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/rs6k-p7g6
You can search by type of violation, zip code, borough where violation occurred,
type of cuisine, just a list of every restaurant that received a citation, and
much more. More importantly, the grading system is as follows: Grades A, B, and
C as expected, then N/Z for grade pending or ungraded, then P for grade pending
after an inspection resulted in a closure of the restaurant (basically an F in
our system).
51djian20@stuy.eduThis data reports the number of HIV/AIDS diagnoses by neighborhood, sex, and
race/ethnicity in NYC from 2010 to 2013.
https://data.cityofnewyork.us/Health/HIV-AIDS-Diagnoses-by-Neighborhood-Sex-and-Race-Et/ykvb-493p
There are a lot of interesting ways for Python to process the data, but some
rows list either "All" races or "Unknown" race, as well as "All" sexes.
The data is also separated by year from 2010 to 2013.
52rfaruk20@stuy.eduInfant Mortality rate by maternal race/ethnicity
https://data.cityofnewyork.us/Health/Infant-Mortality/fcau-jc6k/data
The data has some sections labeled Other/Two or More for the ethnicity category
that have no numbers for the mortality rates.
53kdudani20@stuy.eduThis data states the concentration of mercury and other heavy metals present
within consumer products.
https://data.cityofnewyork.us/Health/Metal-Content-of-Consumer-Products-Tested-by-the-N/da9u-wz3r
There are some pieces of data where it is stated that there is a negative
concentration of the metal in question, which is odd.
54ewu21@stuy.eduInformation about dogs owned in NYC. Names, gender, breed, DOB, and borough
that the dog resides in.
https://data.cityofnewyork.us/Health/NYC-Dog-Licensing-Dataset/nu7n-tubpThe information is updated yearly and it's probably possible to find your dog
on the list if you own one
55twang20@stuy.eduThis is a chart of hospitals, questions about the hospital, one answer, and the
percent of people that got that answer.
https://data.cityofnewyork.us/Health/NYC-Health-Hospitals-patient-satisfaction-scores-2/hi3x-y76v
Click the link, scroll all the way down. Click on view data. Then on the right
side there will be multiple options. Click on export. Under the export option
there will be a download option. Click csv.
56fnemati20@stuy.eduNYC leading causes of deathhttps://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtam7 columns: year, leading cause, sex, race ethnicity, deaths, death rate, age
adjusted death rate; there are periods where there is no data in the last 3
columns. Last 3 columns are integers/floating point numbers.
57mkhan21@stuy.eduPopulation Stats: Causes of Death per Year divided by sex and ethnicity.
https://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtam
The data is not in an easy-to-follow order; it is somewhat random: it doesn't go
F,Hispanic to M,Hispanic to F,Asian. So in order to get a visual idea you would
need to sort the data pretty well.
58nlin21@stuy.eduThis dataset provides the leading causes of death in New York City from
2007-2014.
https://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtamOne of the columns in this data set is "Age Adjusted Death Rate", which is a
death rate that accounts for ethnicity, racial, and population age differences.
59skim20@stuy.eduNew York City's leading causes of death
https://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtam
Last updated in February 2020, but only includes information up to the year
2014. There are 1094 rows and 7 columns.
60rkondo20@stuy.eduLeading cause of death
https://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtam
Just click view data. There are 7 columns and it provides many things you
probably need.
61apurohit20@stuy.edunyc leading causes of deathhttps://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtamthe data is organized by race, sex, and death rate (both normal and age
adjusted)
62hchen20@stuy.eduLeading Causes of Death in NYChttps://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtamVisualize->Dimension->Leading Cause; bar chart or column chart for best
visualization
63eyong20@stuy.eduNew York City Leading Causes of Deathhttps://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtamTo download the data, go on the website and press export, which will allow you
to download the data in different formats.
64kchen20@stuy.eduLeading Causes of Death
https://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtam/data
You can filter the data for just a specific group of people (sex and race) or
for a specific year. For example, the leading cause of death for Asian and Pacific
Islander women in 2014 was cancer.
65dkong20@stuy.eduThe data gives a representation of where people have been getting vaccines and
additional information about the patient that could give hints towards their
socioeconomic status.
https://data.cityofnewyork.us/Health/New-York-City-Locations-Providing-Seasonal-Flu-Vac/w9ei-idxz
26 columns and 885 rows.
66rbalakrishnan20@stuy.eduMost popular baby names.https://data.cityofnewyork.us/Health/Popular-Baby-Names/25th-nujfYou can download the data if you click on the link and click "export" and then
"CSV" (like most of the files on the site). The data lists the year of birth,
gender, ethnicity, name, number of people with the name, and popularity rank.
67mlin20@stuy.eduThe data represents a ranking of baby names in New York City by frequency. The
data includes ethnicity, gender, baby count for each name, and year.
https://data.cityofnewyork.us/Health/Popular-Baby-Names/25th-nujfTo access the data, you can click on the export button and choose the format
you would like to download it in. In addition, there is something really
interesting about this data set. You can create a visualization and view it in
many different formats and graphs.
68aruparel20@stuy.eduThe data describes popular birth names and their genders. I used the NYC Open
Data info for this
https://data.cityofnewyork.us/Health/Popular-Baby-Names/25th-nujfYou can download this as a csv file
69owang20@stuy.eduPopular baby nameshttps://data.cityofnewyork.us/Health/Popular-Baby-Names/25th-nujfEach sex and race has its own rankings. For example, male hispanics and female
hispanics each have their own ranking list.
70dvoynov20@stuy.eduPopular baby names for boys and girls of different ethnic groupshttps://data.cityofnewyork.us/Health/Popular-Baby-Names/25th-nujf/dataNo
71slee21@stuy.eduThe number of arrests per race each yearhttps://data.cityofnewyork.us/Public-Safety/Crime-Enforcement-Activity/qk6i-zchtThe button "Crime Enforcement Activity" lets you save the documents. The table
consists of percentages for each race.
72dwells20@stuy.eduThis data reports the admission of inmates into NYC Department of Correction
facilities. The 7 columns (from left to right) describe Inmate ID, Date of
Admission, Date of Discharge (if applicable), Race, Gender (M or F), Inmate
Status Code, and the code of the inmate's top legal charge.
https://data.cityofnewyork.us/Public-Safety/Inmate-Admissions/6teu-xtgpThis data can be downloaded as a CSV file. One should note that some of the
rows of data do not have complete information (e.g., the discharge date is
unavailable, because the inmate that the information pertains to has not yet
been discharged).
73jmei20@stuy.eduNYPD Arrest Data
https://data.cityofnewyork.us/Public-Safety/NYPD-Arrest-Data-Year-to-Date-/uip8-fykc
- Level of offense is sorted into felony (F), misdemeanor (M), and violation (V)
- Data can also be sorted by race and sex of offender, borough, type of crime,
  age group, and even latitude and longitude (among other things)
- >200K rows
74dflocos20@stuy.eduDirectory Of Toilets In Public Parks
https://data.cityofnewyork.us/Recreation/Directory-Of-Toilets-In-Public-Parks/hjae-yuav
The download for the data table is on the linked NYC Open Data page.
75achoi21@stuy.eduEateries (e.g. snack bars, food carts, mobile food trucks, restaurants) in NYChttps://data.cityofnewyork.us/Recreation/Directory-of-Eateries/8792-ebcpYou could access its json file. Since python supports it, you can import json
and read it with the 'json.load()' method.
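Reading a JSON file with `json.load()`, as suggested above, could look roughly like this. The sample content and its field names are hypothetical stand-ins, since the real file's layout is not described here; `load_names` is an illustrative helper.

```python
import io
import json

# Hypothetical stand-in for the downloaded file; real field names may differ.
SAMPLE = (
    '[{"name": "Example Snack Bar", "type": "Snack Bar"},'
    ' {"name": "Example Cart", "type": "Food Cart"}]'
)

def load_names(json_file):
    """Parse a JSON array of records from a file object and return each 'name'."""
    data = json.load(json_file)  # json.load reads from a file object
    return [item["name"] for item in data]

names = load_names(io.StringIO(SAMPLE))
print(names)
```

With a real download, the `io.StringIO(SAMPLE)` argument would be replaced by a file opened with `open(...)` on the saved path.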
76elin20@stuy.eduNew York Public Library Stats
https://data.cityofnewyork.us/Recreation/New-York-Public-Library-NYPL-Branch-Services-from-/ne9z-skhf
There is a lot of information that can be sorted with it.
77zbuff20@stuy.eduThe data gives addresses, phone numbers, and website links to theaters in NYC.https://data.cityofnewyork.us/Recreation/Theaters/kdu2-865wI downloaded it as a .csv file. It can be opened using gedit or TextEdit
(probably other text-editing software too) but then all the data is smushed
together. To read it as a full readable table I opened it using the Numbers
application.
78mzhang21@stuy.eduIt's all the 311 service requests made in 2006
https://data.cityofnewyork.us/Social-Services/311-Service-Requests-for-2006/hy4q-igkk
There are a lot of other columns for more specific requests or crimes that are
left blank; really only the first 10 or so columns have anything in them.
79pvonmueffling20@stuy.eduAbuse/neglect rate per county
https://data.cityofnewyork.us/Social-Services/Abuse-Neglect-by-Community-District-CD-/rnjn-x48k
must download
80rbhuiyan20@stuy.eduShows the population of homeless people in an area of NYC.
https://data.cityofnewyork.us/Social-Services/Directory-Of-Homeless-Population-By-Year/5t4n-d72c/data
Press export -> Download -> and then download as CSV.
81jliu20@stuy.eduTaxi statshttps://data.cityofnewyork.us/Transportation/2015-Yellow-Taxi-Trip-Data/ba8s-jw6uComes with graphs as well, columns to note: Passenger count, trip distance,
fare amount, tips, and payment type
82eshan20@stuy.eduClosed street potholes inspected + repaired by NYCDOT
https://data.cityofnewyork.us/Transportation/Street-Pothole-Work-Orders-Closed-Dataset-/x9wy-ing4
Sort by newest, scroll all the way down, and you can export as csv. It is a
little bit lengthy with 290,000 rows and 16 columns which is a lot to process.
83dfridlyand20@stuy.eduTraffic Volume
https://data.cityofnewyork.us/Transportation/Traffic-Volume-Counts-2012-2013-/p424-amsu
Covers only 2012-2013, so it is not recent. It is a csv file if you click export.
84cchen20@stuy.eduLocations of farmers markets throughout NYC with the days they are open and the
operation times
https://data.cityofnewyork.us/dataset/DOHMH-Farmers-Markets/8vwk-6iz2/dataNote that most of the data are strings rather than numbers. You can still
analyze it though.
85eoo20@stuy.eduemission of varied greenhouse gases (kg per capita) by different countries over
time
https://data.oecd.org/air/air-and-ghg-emissions.htm#indicator-chartthe data can be downloaded as a csv file that contains the specific greenhouse
gas (or all the greenhouse gasses), kg per capita, country, and year.
86gwickham20@stuy.eduSan Francisco plant and ecosystem datahttps://data.sfgov.org/Energy-and-Environment/San-Francisco-Plant-Finder-Data/vmnk-skihClick "Export", next to "View Data" and "Visualize" then select csv
87xliuliu20@stuy.eduTraffic Speed Datahttps://data.world/ahalps/social-influence-on-shoppingThey make you sign up to their website to be able to access the data. So it is
kind of annoying, but it is in .csv format
88mzen20@stuy.eduDC voter registration data
https://data.world/codefordc/dc-voter-registration-data/workspace/file?filename=20141218-dc_voters.csv.zip
It asks you to make a free account (or sign in with Google), then you can
download it.
89rlee20@stuy.eduMany financial statisticshttps://data.worldbank.org/It's free
90mshahrier20@stuy.eduLists the GDP per capita for every recognized country in the world, for the
last approximately 50 years.
https://data.worldbank.org/indicator/NY.GDP.PCAP.CD?view=map
On the right, click download 'CSV file.'
91sliu21@stuy.edujk that wasn't the last one. here's one more dataset of datasets, including
(among other things) walruses, doggies, music, hate crimes, and food!
https://docs.google.com/spreadsheets/d/1wZhPLMCHKJvwOkP4juclhjFgqIY8fQFMemwKL2c64vkno
92sliu21@stuy.eduthe free encyclopedia, and other thingshttps://dumps.wikimedia.org/ok this is the last one
93cpyne20@stuy.eduTop Grossing Movies from 2007-11
https://gist.githubusercontent.com/Beelat2018/29fb23175c5161797f17fc1f970824fb/raw/df29c043fdb865ac9d173303a0eebaa70af806e9/gistfile1.txt
Sourced from Github, user Beelat2018, alternate link here:
https://gist.github.com/Beelat2018/29fb23175c5161797f17fc1f970824fb
94sliu21@stuy.eduperfectly appropriate and clean wordshttps://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Wordsno
95sliu21@stuy.eduanother list of datasetshttps://github.com/awesomedata/awesome-public-datasetsno
96sliu21@stuy.eduweed priceshttps://github.com/frankbi/price-of-weed/tree/master/datano
97igraham20@stuy.eduCoronavirus data in the US by county, except for a few areas, which are by city.https://github.com/nytimes/covid-19-data/blob/master/us-counties.csvIt is only viewable on github as raw data.
98sliu21@stuy.edupersonalized license plateshttps://github.com/veltman/ca-license-platesno
99hscheuer20@stuy.eduData sets relating to health, compiled by the government. They have everything
from the cost of hospital trips to data as recent as covid cases and tests.
https://healthdata.gov/It's a super easy website to navigate. You can search for keywords, locations,
or the date the data was last modified. Once you find a data set that interests
you, you can either preview it, or download it as a CSV.
100rshrestha20@stuy.eduThe data is about the NFL quarterbacks in the 2011 season. It contains their
names, the team they played on, and statistics about them. These statistics
include the number of fumbles during that season, number of sacks, and
number of rushing touchdowns. These are just three of the statistics included
in this file.
https://nathanbrixius.wordpress.com/2013/01/13/2012-nfl-statistics-by-player-and-team-in-csv-format/
After entering the link, you will see another link with the words “The files
are located here.” Clicking that link will take you to another page with
several files and folders. The file I downloaded was in the NFL folder. After
clicking on that folder, you will see more files and folders. The file I picked
is called “NFL 2011 QB.csv.” Just right click it (or control click on a Mac)
and download the file.
101sshuhan20@stuy.eduDescribes the current and past weather at any given location.https://openweathermap.org/apiYou need to import a json and http library to successfully use the API (which
also requires a key). First, create the link using the zipcode you want, and
then go to it to grab the json. Any fully implemented json parser can do the
job. ~15 LOC
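The steps above (build the link from a zip code, fetch it, parse the JSON) might be sketched like this. The endpoint path, the `zip` and `appid` query parameters, and the shape of the sample response are assumptions to verify against the OpenWeatherMap documentation; the actual network call is left out so the sketch runs without a key.

```python
import json
from urllib.parse import urlencode

def build_url(zip_code, api_key, country="us"):
    """Build a request URL; endpoint and parameter names are assumptions."""
    query = urlencode({"zip": f"{zip_code},{country}", "appid": api_key})
    return f"https://api.openweathermap.org/data/2.5/weather?{query}"

# Hypothetical, heavily trimmed response body used in place of a live request.
SAMPLE_RESPONSE = '{"name": "New York", "main": {"temp": 285.5}}'

def temperature(response_text):
    """Pull the temperature out of a response shaped like SAMPLE_RESPONSE."""
    return json.loads(response_text)["main"]["temp"]

url = build_url("10282", "YOUR_KEY")  # YOUR_KEY is a placeholder, not a real key
print(url)
print(temperature(SAMPLE_RESPONSE))
```

For a live request, the text passed to `temperature` would come from something like `urllib.request.urlopen(url).read()`, which requires a valid API key.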
102vqiu20@stuy.eduMy data is a report on the total number of coronavirus cases in every country
by date, from 12/31/2019 to 4/27/2020
https://ourworldindata.org/coronavirus-source-dataJust look for the one that says total confirmed cases. I chose this link
because this data relates to recent news and I just thought it'd be interesting
in general to see which countries became infected in relation to previously
infected countries.
103eandrews20@stuy.eduAgriculture statistics, including land use.https://quickstats.nass.usda.govWhen you retrieve data for an entry, pressing the "spreadsheet" button allows
you to obtain a .csv file of the data.
104sliu21@stuy.edumore datahttps://statweb.stanford.edu/~sabatti/data.htmlno
105sliu21@stuy.eduset of setshttps://vincentarelbundock.github.io/Rdatasets/datasets.htmlno
106xma20@stuy.eduHistoric surface temperature of the city of Raleigh, UShttps://www.chapelhillopendata.org/api/v2/catalog/datasets/earth-surface-temperature-data0/exports/csv
When you open the link, it will send the CSV data to your downloads. It has six
columns and is formatted very similarly to the SAT CSV files we have gotten
before. Unlike before, if there is no data, there will not be an "s" to fill the
blank. The last three columns show the same data in every row.
107mmelucci20@stuy.eduThe data I found is about Citi Bike members and their rides.https://www.citibikenyc.com/system-dataIt includes the following: Trip Duration (seconds), Start Time and Date, Stop
Time and Date, Start Station Name, End Station Name, Station ID, Station
Lat/Long, Bike ID, User Type (Customer = 24-hour pass or 3-day pass user;
Subscriber = Annual Member), Gender (Zero=unknown; 1=male; 2=female), and Year of
Birth. The data is divided monthly and runs from June 2013 to March 2020.
108rlui20@stuy.eduIt’s a collection of over 200,000 datasets of US government data that are
separated into 14 different topics.
https://www.data.gov/There are many, many different data sets and data tables that are relatively
easy to access, most of which are downloadable.
109jwang22@stuy.eduThe data I present is baseball stats of the 2019 season by team. The specific
stats are hitting stats of each team, for example batting average, number of
home runs, etc.
https://www.espn.com/mlb/stats/team/_/view/battingThe chart will be there as soon as you open the link. However, in order to
upload it as a CSV file, you have to do it from Excel: you export the data
using the "upload from web" option.
110ebuller20@stuy.eduThis dataset looks at all the songs in each of Spotify's "All Out..." decade
playlists and at their attributes, such as BPM or danceability.
https://www.kaggle.com/cnic92/spotify-past-decades-songs-50s10s#1950.csvEach decade (50s to New 10s) is split into one CSV file and there is no file
from the link that combines all of them. However, all of the files are in
the same format, so they can be combined. These attributes are measured by an API, which tries to
calculate things such as Beats Per Minute, Acousticness, and even how "happy"
a song is. There are many ranges associated with these attributes, but generally,
the larger a number is, the more the song matches the quality being measured.
More details about this API can be found here:
https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/
111ebuller20@stuy.eduIt takes the data of songs found in Spotify's "All out..." decade playlists and
lists the attributes found from Spotify's API.
https://www.kaggle.com/cnic92/spotify-past-decades-songs-50s10s/version/9While downloading this data is free, you need to make a Kaggle account (which
is also free). There are 7 separate data sheets (1 for each decade). If you
want to have a spreadsheet that includes all the decades, you will need to make
one yourself. It is easy since all the spreadsheets follow the same format. As
said previously, the attributes of each song are measured by an API Spotify has
that measures certain qualities of a song. These include Beats Per Minute,
whether it is acoustic or not, and even how happy a song is. They all have
different ranges of values, but usually the higher the number is, the more of a
certain quality the song has. More information can be found here:
https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/
Disregard my last submission; this one has more info.
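Since the decade files share one format, making the combined spreadsheet yourself can be sketched as below. This is only an illustration. It assumes each file starts with the same header row. The extra "decade" column and the function name are made up, not part of the dataset.

```python
import csv


def combine_csv(sources):
    """Merge CSV files that share a header.

    sources is a list of (label, open text file) pairs. Each row gets the
    label appended in a new, hypothetical "decade" column.
    """
    header = None
    rows = []
    for label, handle in sources:
        reader = csv.reader(handle)
        file_header = next(reader)
        if header is None:
            header = file_header
        elif file_header != header:
            raise ValueError(f"{label}: header does not match the first file")
        for row in reader:
            rows.append(row + [label])
    if header is None:
        raise ValueError("no files given")
    return header + ["decade"], rows
```

With the downloaded files, each pair could look like ("1950", open("1950.csv", newline="")), and the result can be written back out with csv.writer.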
112etorres20@stuy.eduThere is a plethora of data contained within this downloadable file: there are
16 columns of Airbnb-related data. These columns include listing name, host name,
price, neighborhood, and reviews.
https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data/data#AB_NYC_2019.csvThere are hundreds and hundreds of listings and accessing the data is relatively
easy. Simply visit the link above to reach the Kaggle website; it did make me
log in with Google, however. When you reach the page, there is a box that says
"download"; the file is 7 MB and automatically downloads as a CSV. The
CSV has a few gaps in some rows because not all of the listings have received
the reviews and ratings that are columns in the table.
113iwei20@stuy.eduThis dataset contains information on player reconnaissance in over 500
professional-level Starcraft games. From the perspective of one player (the
Terran), it contains information on how many enemy (Protoss) units the player
has observed, can observe, has seen destroyed, etc., along with an overall
measure of how much enemy territory the player can see.
https://www.kaggle.com/kinguistics/starcraft-scouting-the-enemyYou have to make a Kaggle account to download it.
114wwoodruff20@stuy.eduWins, losses, ties, and win percentage of all 32 teams of the NFL in the 2019
season, as well as what division they play in.
https://www.pro-football-reference.comWhen you click on one of the teams, it shows you extra information.
115cliu20@stuy.eduPlay-by-play (more recently, pitch-by-pitch) data for almost every MLB game
since 1918.
https://www.retrosheet.org/game.htmThis might not be very interesting for those not into baseball. When you
download the data, it will appear as a .zip file. When you open the file, the
records will have an unusual filename extension that indicates what's in
the file (specifically, whether it's an event log or a roster, and the league:
American or National). All of the files are in CSV
format. https://www.retrosheet.org/datause.txt and
https://www.retrosheet.org/eventfile.htm have a good explanation of what all
of the notation means and what should be expected for each line. For example,
"play,2,1,jeted001,11,BS1.X,63/G" means that in the bottom of the second
inning, Derek Jeter took a ball, then had a swinging strike, then the pitcher
made a pickoff throw to first, then he hit a ground ball to the shortstop, who
threw it to first base for a putout.
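The example record above can be split into named fields with a few lines of Python. This is a minimal sketch based only on that one example: the field order (inning, home/visitor flag, batter ID, count, pitches, event) and the four pitch codes are taken from the explanation above. The class and function names are made up, and the real event-file notation has many more codes than this handles.

```python
from dataclasses import dataclass
from typing import Optional

# Only the pitch codes that appear in the example above; "." is skipped.
PITCH_CODES = {
    "B": "ball",
    "S": "swinging strike",
    "1": "pickoff throw to first",
    "X": "ball put into play",
}


@dataclass
class Play:
    inning: int
    bottom: bool            # True when the home/visitor flag is "1"
    batter_id: str
    balls: Optional[int]    # None when the count is not recorded as digits
    strikes: Optional[int]
    pitches: str
    event: str


def parse_play(line):
    """Split one 'play' record into named fields."""
    fields = line.strip().split(",")
    if len(fields) != 7 or fields[0] != "play":
        raise ValueError("not a 7-field play record")
    _, inning, side, batter, count, pitches, event = fields
    known = len(count) == 2 and count.isdigit()
    return Play(
        inning=int(inning),
        bottom=(side == "1"),
        batter_id=batter,
        balls=int(count[0]) if known else None,
        strikes=int(count[1]) if known else None,
        pitches=pitches,
        event=event,
    )


def describe_pitches(pitches):
    """Translate the pitch string, ignoring codes not in PITCH_CODES."""
    return [PITCH_CODES[code] for code in pitches if code in PITCH_CODES]
```

For the Jeter line, parse_play gives inning 2, bottom of the inning, a 1-1 count, and the event "63/G". describe_pitches("BS1.X") lists the ball, the swinging strike, the pickoff throw, and the ball put into play.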
116gbae20@stuy.edu2018 Census results about things such as ethnic groups and hours worked per weekhttps://www.stats.govt.nz/assets/Uploads/2018-Census-totals-by-topic/Download-data/2018-census-totals-by-topic-national-highlights-csv.zip
https://www.stats.govt.nz/large-datasets/csv-files-for-download/ (I found it on
this website and scrolled down to the first link in the Census area)
117hlin20@stuy.eduAbortion rates sorted by age of womenhttps://www.stats.govt.nz/large-datasets/csv-files-for-download/You can scroll down, download this file, and open it by unzipping it. This
website also contains several other data sets, and this file, and probably all the
others, has more than one data set.
118dmaheshwari20@stuy.eduThis collection of data is about the film industry, specifically the budget,
national and global box office sales, dates, titles, etc., about specific movies,
franchises, or groupings of movies.
https://www.the-numbers.com/For our purposes, it is probably best to look at the box office, home video,
and movie categories to see domestic and international data. Some tables show
percent change along with the categories listed above and other useful
information in judging the success of a film.
119chan20@stuy.eduMean BMI of both males and females older than 18, from 1983 to 2016, in all of the
countries of the world, by the WHO
https://www.who.int/data/gho/data/indicators/indicator-details/GHO/mean-bmi-(kg-m-)-(age-standardized-estimate)
You are going to have to choose "export file"; then, under file format, you will
have the option to choose "CSV 30000 row". Finally, you can just click export
and you will have all this juicy information.
120jchen20@stuy.eduThis data is historical. It counts how many cases of murder, rape, robbery, felony
assault, burglary, grand larceny, and grand larceny of a motor vehicle there
were in each year from 2000 to 2019.
https://www1.nyc.gov/assets/nypd/downloads/pdf/analysis_and_planning/historical-crime-data/seven-major-felony-offenses-2000-2019.pdf
It looks like a simple 9 by 21 table and I was able to download the CSV file.
You can find the download link by searching for citywide crime statistics on NYC
Open Data, clicking SHTML, and clicking historical crime stats. Then, you
download the file for Citywide Seven Major Felony Offenses. Of course, I don't
completely trust crime statistics because arrests are never concrete; false
accusations happen a lot. In addition, these are only reported cases. Not
everyone is going to call the cops when they are assaulted.
121jlane20@stuy.eduIt is a record of all the stop, question, and frisks performed in NYC in 2019.
It contains a massive amount of data about basically anything one could know
about that situation.
https://www1.nyc.gov/site/nypd/stats/reports-analysis/stopfrisk.pageI chose the link at the top, the 2019 Excel file.