Update README.md
parent 6f361fa122
commit 26bf6d4bb5
120 README.md
@ -12,7 +12,7 @@
- High Precision / Recall
- Worldwide Names

<!-- ## Installation
## Installation
```
npm install name-dataset
```

@ -21,7 +21,7 @@ npm install name-dataset

```js
const Names = require('name-dataset')
``` -->
```

```
echo -e "$(python main.py 'Brian is in the kitchen while Amanda is watching the TV on the sofa.\nThey are both waiting for Alfred to come.')"

@ -37,82 +37,44 @@ Here is an example on a (old) text: [ALI BABA AND THE FORTY THIEVES](http://text

<img src='assets/img_2.png'/>

## License
## Dataset Generation

I don't own the data obviously. It's fetched from the websites listed in:
[generate.sh](name-dataset/blob/master/generation/generate.sh)

https://github.com/philipperemy/name-dataset/blob/master/generation/generate.sh

So I guess the most strict software license should apply here.

## Sources and References

Exhaustive list of all candidate websites. Not all are used, since the lists contain a lot of garbage.

- Generator: http://listofrandomnames.com/index.cfm?generated
- https://www.sajari.com/public-data: 5000 names (First Names CSV)
- http://www.20000-names.com/ names around the world
- https://catalogue.data.gov.bc.ca/dataset/most-popular-boys-names-for-the-past-100-years British Columbia
- https://catalogue.data.gov.bc.ca/dataset/most-popular-girl-names-for-the-past-100-years British Columbia
- https://www.nrscotland.gov.uk/statistics-and-data/statistics/statistics-by-theme/vital-events/names/babies-first-names/full-lists-of-babies-first-names-2010-to-2014 Scotland

- https://gender-api.com/en/pricing

- https://github.com/OpenGenderTracking/globalnamedata/tree/master/assets
- From https://bocoup.com/blog/global-name-data

- https://github.com/MatthiasWinkelmann/firstname-database

- http://www.namepedia.org/en/firstname/Nabil/

- https://datasets.imdbws.com/
- https://www.imdb.com/interfaces/

- https://opendata.stackexchange.com/questions/46/multinational-list-of-popular-first-names-and-surnames
- ftp://ftp.heise.de/pub/ct/listings/0717-182.zip

- https://data.world/howarder/gender-by-name

- https://statbel.fgov.be/en/open-data/first-names-total-population-municipality

- https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/bulletins/babynamesenglandandwales/previousReleases

- http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/corpora/names/

- https://www.ssa.gov/oact/babynames/limits.html

- https://www.ssa.gov/OACT/babynames/

- https://www.ssa.gov/cgi-bin/popularnames.cgi

- https://github.com/hadley/data-baby-names/blob/master/baby-names.csv

- http://www.quietaffiliate.com/free-first-name-and-last-name-databases-csv-and-sql/

- https://stackoverflow.com/questions/1452003/plain-computer-parseable-lists-of-common-first-names

- http://mbejda.github.io/

- https://www2.census.gov/topics/genealogy/1990surnames/dist.all.last

- https://opendata.stackexchange.com/questions/1108/database-of-names-of-japanese-and-non-japanese-people

- https://opendata.stackexchange.com/questions/12234/name-and-gender-dataset

- https://opendata.stackexchange.com/questions/7071/people-names-by-country

- http://www.randomnames.com/all-boys-names.asp

- https://en.wikipedia.org/wiki/List_of_most_popular_given_names#cite_note-ahram2004-2

- http://www.avss.ucsb.edu/NameFema.HTM

- http://www.oxfordreference.com/view/10.1093/acref/9780198610601.001.0001/acref-9780198610601?btog=chap&hide=true&page=248&pageSize=10&skipEditions=true&sort=titlesort&source=%2F10.1093%2Facref%2F9780198610601.001.0001%2Facref-9780198610601

- https://github.com/dominictarr/random-name/blob/master/first-names.txt

- https://github.com/smashew/NameDatabases/tree/master/NamesDatabases/first%20names

- https://www.behindthename.com/names

- https://incompetech.com/named/multi.pl
- [listofrandomnames.com](http://listofrandomnames.com/index.cfm?generated)
- [sajari.com 5000 Names around the Globe](https://www.sajari.com/public-data)
- [20000-names.com](http://www.20000-names.com)
- [BC Gov Boys Names 100 Years](https://catalogue.data.gov.bc.ca/dataset/most-popular-boys-names-for-the-past-100-years)
- [BC Gov Girls Names 100 Years](https://catalogue.data.gov.bc.ca/dataset/most-popular-girl-names-for-the-past-100-years)
- [Scotland Baby Names](https://www.nrscotland.gov.uk/statistics-and-data/statistics/statistics-by-theme/vital-events/names/babies-first-names/full-lists-of-babies-first-names-2010-to-2014)
- [Open Gender Tracking](https://github.com/OpenGenderTracking/globalnamedata/tree/master/assets)
- [bocoup.com Global Names](https://bocoup.com/blog/global-name-data)
- [MatthiasWinkelmann's Repo](https://github.com/MatthiasWinkelmann/firstname-database)
- [Namepedia](http://www.namepedia.org/en/firstname/Nabil)
- [IMDb Datasets](https://datasets.imdbws.com)
- [IMDb Interfaces](https://www.imdb.com/interfaces)
- [Stack Exchange Open Data](https://opendata.stackexchange.com/questions/46/multinational-list-of-popular-first-names-and-surnames)
- [heise.de Listings](ftp://ftp.heise.de/pub/ct/listings/0717-182.zip)
- [Data World](https://data.world/howarder/gender-by-name)
- [Belgium Gov](https://statbel.fgov.be/en/open-data/first-names-total-population-municipality)
- [UK Gov Birth](https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/bulletins/babynamesenglandandwales/previousReleases)
- [CMU AI Repo Corpora](http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/corpora/names)
- [US Social Security Data Baby Names I](https://www.ssa.gov/oact/babynames/limits.html)
- [US Social Security Data Baby Names II](https://www.ssa.gov/OACT/babynames/)
- [US Social Security Data Popular Names](https://www.ssa.gov/cgi-bin/popularnames.cgi)
- [Hadley Repo Baby Names](https://github.com/hadley/data-baby-names/blob/master/baby-names.csv)
- [QuietAffiliate.com](http://www.quietaffiliate.com/free-first-name-and-last-name-databases-csv-and-sql)
- [Stack Overflow](https://stackoverflow.com/questions/1452003/plain-computer-parseable-lists-of-common-first-names)
- [Mbejda Repo](http://mbejda.github.io)
- [US Gov Census](https://www2.census.gov/topics/genealogy/1990surnames/dist.all.last)
- [Stack Exchange Open Data Japanese](https://opendata.stackexchange.com/questions/1108/database-of-names-of-japanese-and-non-japanese-people)
- [Stack Exchange Open Data Gender](https://opendata.stackexchange.com/questions/12234/name-and-gender-dataset)
- [Stack Exchange Open Data Country](https://opendata.stackexchange.com/questions/7071/people-names-by-country)
- [Randomnames.com Boys](http://www.randomnames.com/all-boys-names.asp)
- [Wikipedia Popular Names](https://en.wikipedia.org/wiki/List_of_most_popular_given_names#cite_note-ahram2004-2)
- [UCSB Female Names](http://www.avss.ucsb.edu/NameFema.HTM)
- [Oxford Reference](http://www.oxfordreference.com/view/10.1093/acref/9780198610601.001.0001/acref-9780198610601?btog=chap&hide=true&page=248&pageSize=10&skipEditions=true&sort=titlesort&source=%2F10.1093%2Facref%2F9780198610601.001.0001%2Facref-9780198610601)
- [dominictarr Repo](https://github.com/dominictarr/random-name/blob/master/first-names.txt)
- [smashew Repo](https://github.com/smashew/NameDatabases/tree/master/NamesDatabases/first%20names)
- [Behind The Name](https://www.behindthename.com/names)
- [Incompetech](https://incompetech.com/named/multi.pl)