File processing in Clojure can easily become CPU bound

by Alistair Roche, March 1st, 2017 (4 min read)

Too Long; Didn't Read

<a href="https://medium.com/@atroche/an-initial-exploration-of-wikireading-googles-huge-new-nlp-dataset-c17d859db9d0" target="_blank">Recently</a> I’ve been <a href="https://medium.com/@atroche/using-dataflow-in-clojure-to-process-googles-huge-new-wikireading-dataset-832af367539c" target="_blank">playing</a> with a big dataset called “<a href="https://github.com/dmorr-google/wiki-reading" target="_blank">WikiReading</a>” that researchers at Google have used in a <a href="https://arxiv.org/abs/1608.03542" target="_blank">couple</a> of <a href="https://arxiv.org/abs/1611.01839" target="_blank">new</a> NLP papers. It consists of the text of Wikipedia articles mapped to Wikidata statements (e.g. Australia → (deepest point, <a href="https://www.wikidata.org/wiki/Q179970" target="_blank">Lake Eyre</a>)), and adds up to 208GB of JSON.
