Remarks

The data you will find here will NOT be useful unless you are coding a language tool or processing data.

If you simply want sentences that you can use to learn a language, check out the sentence lists. You can build your own, or view the ones that others have created. The lists can be downloaded and printed.

General information about the files

Many of the Japanese and English sentences come from the Tanaka Corpus, which is in the public domain.

Creative Commons

These files are released under CC BY 2.0 FR.

Some of our sentences are also available under CC0 1.0.

Licenses for audio

The license covering an audio file is chosen by the contributor and is indicated on the page that lists the audio files they have contributed.

Questions?

If you have questions or requests, feel free to contact us. In general, we answer quickly.

Downloads

Custom exports

Use this tool to generate and download customized exports on demand.

Sentence pairs
Download all sentences in language A that are translated into language B, along with the translations.

Weekly exports

The files provided below are updated every Saturday at 6:30 a.m. (UTC).

Sentences

Filename

{{sentences | filename}}

All languages
Only sentences in: Abkhaz, Adyghe, Afrihili, Afrikaans, Ainu, Aklanon, Albanian, Algerian Arabic, Amharic, Ancient Hebrew, Arabic, Aragonese, Assamese, Assyrian Neo-Aramaic, Asturian, Avar, Awadhi, Aymara, Azerbaijani, Hill Mari, Balinese, Baluchi, Bambara, Banjar, Bashkir, Basque, Baybayanon, Bavarian, Bengali, Berber, Berom, Bhojpuri, Burmese, Bislama, Bodo, Buryat, Bosnian, Breton, Brithenig, Bulgarian, Cayuga, Cebuano, Central Kanuri, Central Kurdish (Soranî), Central Dusun, Central Huasteca Nahuatl, Central Bikol, Central Mnong, Chagatai, Chamorro, Chavacano, Cherokee, Chinook Jargon, Choctaw, Congo Swahili, Cuyonon, CycL, Danish, Divehi, Drents, Dungan, Dutton World Speedwords, German, Eastern Armenian, Egyptian Arabic, Emilian, English, Erromintxela, Erzya, Esperanto, Estonian, Evenki, Ewe, Extremaduran, Faroese, Fijian, Fiji Hindi, Finnish, French, Frisian, Friulian, Fula, Ga, Gagauz, Khakas, Galician, Gan Chinese, Garhwali, Gheg Albanian, Georgian, Gujarati, Gulf Arabic, Gothic, Greek, Greenlandic, Gronings, Guadeloupean Creole, Guarani, Guerrero Nahuatl, Haitian Creole, Hakka, Hausa, Hawaiian, Hebrew, Hiligaynon, Hindi, Hitchiti, Hmong Daw (White), Hmong Njua (Green), Ho, Hungarian, Hunsrik, Iban, Ido, Irish, Icelandic, Igbo, Ilocano, Indonesian, Ingrian, Interglossa, Interlingua, Interlingue, Inuktitut, Iraqi Arabic, Isan, Italian, Yakut, Jamaican Patois, Japanese, Javanese, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, Yiddish, Jin, Juhuri (Judeo-Tat), K'iche', Kabardian, Kabyle, Kadazan (Coastal), Kalmyk, Kamba, Kannada, Cantonese, Kapampangan, Karakalpak, Karakhanid, Karachay-Balkar, Karelian, Kashmiri, Kashubian, Catalan, Kazakh, Kekchi (Q'eqchi'), Kelantan-Pattani Malay, Keningau Murut, Kölsch, Khalaj, Khasi, Khmer, Kinyarwanda, Kyrgyz, Kiribati, Kirundi, Classical Chinese, Klingon, Kumyk, Komi-Permyak, Komi-Zyrian, Konkani (Goan), Korean, Cornish, Corsican, Kotava, Crimean Tatar, Croatian, Kven, Welsh, Láadan, Ladin, Ladino, Lakota, Lao, Latin, Low German (Low Saxon), Latgalian, Latvian, Libyan Arabic, Livonian, Ligurian, Limburgish, Lingala, Lingua Franca Nova, Lithuanian, Laz, Lojban, Lombard, Louisiana Creole, Luganda, Lushootseed, Luxembourgish, Madurese, Mahasu Pahari, Maithili, Malagasy, Malay, Malay (Vernacular), Malayalam, Maltese, Mambae, Mandarin Chinese, Manchu, Manx, Maori, Mapuche, Marathi, Moroccan Arabic, Marshallese, Mauritian Creole, Macedonian, Meitei, Mi'kmaq, Middle English, Middle French, Middle Persian (Pahlavi), Min Nan Chinese, Minangkabau, Mingrelian, Mirandese, Mohawk, Moksha, Mon, Mongolian, Mono (USA), Muskogee (Creek), Naga (Tangshang), Nahuatl, Nande, Nauruan, Navajo, Neapolitan, Dutch, Lower Sorbian, Nepali, Newari, Ngeq, Nigerian Fulfulde, Niuean, Nyungar, Nogai, North Frisian, Northern Haida, North Levantine Arabic, North Moluccan Malay, Northern Sami, Norwegian (Bokmål), Norwegian (Nynorsk), Northern Kurdish (Kurmancî), Northern Zaza (Kirmanjki), Novial, Nuer, Nuosu, Nyanja, O'odham, Occitan, Odia (Oriya), Udmurt, Uyghur, Ukrainian, Urdu, Uzbek, Ojibwe, Okinawan, Old Aramaic, Old English, Old French, Old Frisian, Ancient Greek, Old Norse, Old Prussian, Old Russian, Old Saxon, Old Spanish, Old Turkish, Old Tupi, Upper Sorbian, Orizaba Nahuatl, Ottoman Turkish, Ossetian, Palauan, Pali, Palatine German, Pangasinan, Papiamento, Pashto, Pennsylvania German, Persian, Phoenician, Piedmontese, Picard, Pipil, Plains Cree, Polish, Portuguese, Punjabi (Eastern), Punjabi (Western), Qashqai, Quechua, Quenya, Rapa Nui, Rendille, Romansh, Riffian, Romanian, Rusyn, Rohingya, Romani, Russian, Samogitian, Samoan, Sango, Sanskrit, Santali, Saraiki, Sardinian, Saterland Frisian, Serbian, Setswana, Seychellois Creole, Shona, Shuswap, Sicilian, Silesian, Sindarin, Sindhi, Chinese Pidgin English, Sinhala, Shanghainese, Scots, Scottish Gaelic, Slovak, Slovenian, Sumerian, Sundanese, Somali, South Levantine Arabic, Southern Kurdish, Southern Zaza (Dimli), Spanish, Sranan Tongo, Standard Moroccan Tamazight, Swabian, Swedish, Swiss German, Swahili, Swazi, Sylheti, Syriac, Tachawit, Tajik, Tagalog, Tagol Murut, Tahaggart Tamahaq, Tahitian, Talossan, Talysh, Tamil, Tashelhit, Tatar, Telugu, Temuan, Tetun, Thai, Tibetan, Tigre, Tigrinya, Czech, Chechen, Chukchi, Chuvash, Tuvan, Tok Pisin, Tokelauan, Toki Pona, Tonga (Zambezi), Tongan, Turkmen, Turkish, Tsonga, Tumbuka, Tuvaluan, Uab Meto, Umbundu, Urhobo, Venetian, Veps, Vietnamese, Volapük, Võro, Meadow Mari, Waray-Waray, Wayuu, Western Armenian, Belarusian, Walloon, Wolof, Xhosa, Xiang Chinese, Yoruba, Yucatec Maya, Zeelandic, Zaza, Zulu, Southern Altai, Southern Haida, Southern Sami, Southern Sotho, Southern Subanen, Unknown language
File description
Contains all the sentences in the selected language. Each sentence is associated with a unique id and an ISO 639-3 language code.
Fields and structure
Sentence id [tab] Lang [tab] Text
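
As a rough illustration of the three-field layout above (sentence id, language code, text), the following Python sketch parses lines of an already decompressed sentences file into records. The Sentence type, the read_sentences helper, and the sample rows are made up for this example and are not part of any Tatoeba tooling.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Sentence(NamedTuple):
    sentence_id: int
    lang: str  # ISO 639-3 code, e.g. "eng"
    text: str


def read_sentences(stream: TextIO) -> Iterator[Sentence]:
    """Yield one Sentence per well-formed line: id [tab] lang [tab] text."""
    for line in stream:
        # Split on the first two tabs only, so the text field is kept whole.
        fields = line.rstrip("\n").split("\t", 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        yield Sentence(int(fields[0]), fields[1], fields[2])


# Illustrative rows, not real Tatoeba data.
sample = io.StringIO("1\teng\tHello.\n2\tnld\tHallo.\n")
sentences = list(read_sentences(sample))
english = [s.text for s in sentences if s.lang == "eng"]
```

Reading line by line keeps memory use flat, which may matter for the all-languages export.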

Detailed sentences

Filename

{{sentencesDetailed | filename}}

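All languages, or only sentences in a single language (the same languages as listed for the Sentences file above)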
File description
Contains additional fields for each sentence (owner name, date created/modified).
Fields and structure
Sentence id [tab] Lang [tab] Text [tab] Username [tab] Date added [tab] Date last modified

Original and Translated Sentences

Filename
sentences_base.tar.bz2
File description
Each sentence is listed as original or a translation of another. The "base" field can have the following values:
  • zero: The sentence is original, not a translation of another.
  • greater than zero: The id of the sentence from which it was translated.
  • \N: Unknown (rare).
Fields and structure
Sentence id [tab] Base field
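
The three possible values of the base field can be told apart with a few lines of Python. This is a minimal sketch; parse_base and describe are hypothetical helper names, and the input lines are invented.

```python
from typing import Optional


def parse_base(field: str) -> Optional[int]:
    # The unknown marker is the literal two characters: backslash followed by N.
    if field == r"\N":
        return None
    return int(field)


def describe(line: str) -> str:
    """Explain one line of the form: sentence id [tab] base field."""
    sentence_id, base_field = line.rstrip("\n").split("\t")
    base = parse_base(base_field)
    if base is None:
        return f"#{sentence_id}: origin unknown"
    if base == 0:
        return f"#{sentence_id}: original sentence"
    return f"#{sentence_id}: translated from #{base}"


# Invented example lines.
messages = [describe(line) for line in ["1\t0\n", "77\t1\n", "90\t\\N\n"]]
```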

Sentences (CC0)

Filename

{{sentencesCC0 | filename}}

All languages
Only sentences in: Algerian Arabic, Ancient Hebrew, Arabic, Bengali, Berber, Danish, German, English, Esperanto, Finnish, French, Hebrew, Hindi, Ho, Hungarian, Ido, Interlingua, Italian, Japanese, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, Yiddish, Kabyle, Cantonese, Karelian, Catalan, Classical Chinese, Klingon, Konkani (Goan), Kven, Welsh, Láadan, Ladino, Latin, Ligurian, Mandarin Chinese, Middle English, Dutch, Nyungar, Norwegian (Bokmål), Ukrainian, Old Aramaic, Old Frisian, Ancient Greek, Old Norse, Phoenician, Polish, Portuguese, Russian, Santali, Spanish, Standard Moroccan Tamazight, Swedish, Sylheti, Tachawit, Czech, Toki Pona, Volapük, Belarusian, Unknown language
File description
Contains all the sentences available under CC0.
Fields and structure
Sentence id [tab] Lang [tab] Text [tab] Date last modified

Links

Filename
links.tar.bz2
File description
Contains the links between sentences. A line 1 [tab] 77 means that sentence #77 is a translation of sentence #1. The reciprocal link is also present, so the file also contains the line 77 [tab] 1.
Fields and structure
Sentence id [tab] Translation id
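
Because every link is stored in both directions, a consumer may want either a lookup table (sentence id to translation ids) or a set of unordered pairs with the duplicates collapsed. The sketch below shows both; read_links and unique_pairs are hypothetical names, and the sample lines are invented apart from the 1/77 pair used in the description above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def read_links(lines: Iterable[str]) -> Dict[int, Set[int]]:
    """Map each sentence id to the ids of its direct translations."""
    translations: Dict[int, Set[int]] = defaultdict(set)
    for line in lines:
        sentence_id, translation_id = line.rstrip("\n").split("\t")
        translations[int(sentence_id)].add(int(translation_id))
    return translations


def unique_pairs(translations: Dict[int, Set[int]]) -> Set[Tuple[int, int]]:
    """Collapse reciprocal links (a, b) and (b, a) into one ordered pair."""
    return {
        (min(a, b), max(a, b))
        for a, targets in translations.items()
        for b in targets
    }


links = read_links(["1\t77\n", "77\t1\n", "1\t2481\n", "2481\t1\n"])
pairs = unique_pairs(links)
```

Joining these ids against the sentences file would give sentence pairs for two chosen languages.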

Tags

Filename
tags.tar.bz2
File description
Contains the list of tags associated with each sentence. A line 381279 [tab] proverb means that sentence #381279 has been assigned the "proverb" tag.
Fields and structure
Sentence id [tab] Tag name
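
One way to use the tags file is to invert it into an index from tag name to sentence ids. The following is a small sketch under that assumption; sentences_by_tag is a hypothetical helper, and the second and third sample lines are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def sentences_by_tag(lines: Iterable[str]) -> Dict[str, Set[int]]:
    """Group sentence ids under each tag name."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for line in lines:
        # Split on the first tab only, in case a tag name contains a tab.
        sentence_id, tag = line.rstrip("\n").split("\t", 1)
        index[tag].add(int(sentence_id))
    return index


index = sentences_by_tag([
    "381279\tproverb\n",
    "381280\tproverb\n",
    "381279\tidiom\n",
])
```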

Lists

Filename
user_lists.tar.bz2
File description
Contains one entry for each sentence list.
Fields and structure
List id [tab] Username [tab] Date created [tab] Date last modified [tab] List name [tab] Editable by

Sentences in lists

Filename
sentences_in_lists.tar.bz2
File description
Indicates which sentences belong to which lists. A line 13 [tab] 381279 means that sentence #381279 is contained in the list whose id is 13.
Fields and structure
List id [tab] Sentence id
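
The list file and the membership file share the list id, so they can be joined. The sketch below assumes the six-field list layout and the two-field membership layout described above; the helper names are hypothetical, and the dates and the "Editable by" value in the sample row are placeholders whose real format may differ.

```python
from typing import Dict, Iterable, List


def list_names(user_list_lines: Iterable[str]) -> Dict[int, str]:
    """Map list id -> list name (first and fifth fields of the list file)."""
    names: Dict[int, str] = {}
    for line in user_list_lines:
        fields = line.rstrip("\n").split("\t")
        names[int(fields[0])] = fields[4]
    return names


def sentences_per_list(membership_lines: Iterable[str]) -> Dict[int, List[int]]:
    """Map list id -> sentence ids, in file order."""
    members: Dict[int, List[int]] = {}
    for line in membership_lines:
        list_id, sentence_id = line.rstrip("\n").split("\t")
        members.setdefault(int(list_id), []).append(int(sentence_id))
    return members


# Placeholder rows: dates and the last field are invented for illustration.
names = list_names(
    ["13\talice\t2020-01-01 00:00:00\t2020-01-02 00:00:00\tProverbs\tcreator\n"]
)
members = sentences_per_list(["13\t381279\n", "13\t42\n"])
summary = {names[i]: ids for i, ids in members.items() if i in names}
```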

Japanese indices

Filename
jpn_indices.tar.bz2
File description
Contains the equivalent of the "B lines" in the Tanaka Corpus file distributed by Jim Breen. See this page for the format. Each entry is associated with a pair of Japanese/English sentences. Sentence id refers to the id of the Japanese sentence. Meaning id refers to the id of the English sentence.
Fields and structure
Sentence id [tab] Meaning id [tab] Text

Sentences with audio

Filename
sentences_with_audio.tar.bz2
File description
Contains the ids of the sentences, in all languages, for which audio is available. Other fields indicate who recorded the audio, its license and a URL to attribute the author. If the license field is empty, you may not reuse the audio outside the Tatoeba project.
Downloading audio
A single sentence can have one or more audio recordings, each from a different voice. To download a particular recording, use its audio id to build the download URL. For example, the audio with the id 1234 is available at https://tatoeba.org/audio/download/1234.
Fields and structure
Sentence id [tab] Audio id [tab] Username [tab] License [tab] Attribution URL
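
The download URL rule and the empty-license rule can be sketched in Python as follows. The record type and function names are hypothetical, the sample rows are invented, and treating a literal \N as an empty license is a defensive assumption rather than something the description above states.

```python
from typing import NamedTuple

# URL prefix taken from the example URL in the description above.
AUDIO_DOWNLOAD_PREFIX = "https://tatoeba.org/audio/download/"


class AudioRecord(NamedTuple):
    sentence_id: int
    audio_id: int
    username: str
    license: str
    attribution_url: str


def parse_audio_line(line: str) -> AudioRecord:
    """Parse: sentence id, audio id, username, license, attribution URL."""
    sid, aid, user, lic, url = line.rstrip("\n").split("\t")
    return AudioRecord(int(sid), int(aid), user, lic, url)


def download_url(audio_id: int) -> str:
    return f"{AUDIO_DOWNLOAD_PREFIX}{audio_id}"


def reusable_outside_tatoeba(record: AudioRecord) -> bool:
    # Empty license means no reuse; "\N" is treated as empty (assumption).
    return record.license not in ("", r"\N")


# Invented rows: one with a license, one without.
licensed = parse_audio_line("42\t1234\talice\tCC BY 4.0\thttps://example.org/alice\n")
unlicensed = parse_audio_line("43\t1235\tbob\t\t\n")
url = download_url(licensed.audio_id)
```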

User skill level per language

Filename
user_languages.tar.bz2
File description
Indicates the self-reported skill levels of members in individual languages.
Fields and structure
Lang [tab] Skill level [tab] Username [tab] Details

Users' sentence reviews

Filename
users_sentences.csv
File description
Contains sentences reviewed by users. The value of the review can be -1 (sentence not OK), 0 (undecided or unsure), or 1 (sentence OK). Warning: this data is still experimental.
Fields and structure
Username [tab] Sentence id [tab] Review [tab] Date added [tab] Date last modified
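
Since a sentence can be reviewed by several users, the -1, 0 and 1 values are often easier to work with once tallied per sentence. This is a sketch only: tally_reviews and net_score are hypothetical helpers, the net score is one possible summary and not a Tatoeba metric, and the dates in the sample rows are placeholders.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def tally_reviews(lines: Iterable[str]) -> Dict[int, Counter]:
    """Count how often each review value (-1, 0, 1) was given per sentence."""
    tallies: Dict[int, Counter] = defaultdict(Counter)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        sentence_id, review = int(fields[1]), int(fields[2])
        tallies[sentence_id][review] += 1
    return tallies


def net_score(tally: Counter) -> int:
    """Sum of all review values: OK votes minus not-OK votes."""
    return sum(value * count for value, count in tally.items())


# Invented rows; the date strings are placeholders.
tallies = tally_reviews([
    "alice\t100\t1\t2020-01-01 00:00:00\t2020-01-01 00:00:00\n",
    "bob\t100\t1\t2020-01-02 00:00:00\t2020-01-02 00:00:00\n",
    "carol\t100\t-1\t2020-01-03 00:00:00\t2020-01-03 00:00:00\n",
    "alice\t200\t0\t2020-01-04 00:00:00\t2020-01-04 00:00:00\n",
])
```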

Transcriptions

Filename

{{transcriptions | filename}}

All languages
Only sentences in: Japanese, Cantonese, Mandarin Chinese, Uzbek
File description
Contains all transcriptions in auxiliary or alternative scripts. A username associated with a transcription indicates the user who last reviewed and possibly modified it. A transcription without a username has not been marked as reviewed. The script name is defined according to the ISO 15924 standard.
Fields and structure
Sentence id [tab] Lang [tab] Script name [tab] Username [tab] Transcription
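
The reviewed/unreviewed distinction follows directly from the username field, as the sketch below shows. The Transcription type and parse_transcription are hypothetical, the sample rows are invented, and treating a literal \N as a missing username is a defensive assumption.

```python
from typing import NamedTuple


class Transcription(NamedTuple):
    sentence_id: int
    lang: str
    script: str  # ISO 15924 code, e.g. "Latn" or "Hrkt"
    username: str
    text: str

    @property
    def reviewed(self) -> bool:
        # No username means the transcription was never marked as reviewed.
        return self.username not in ("", r"\N")


def parse_transcription(line: str) -> Transcription:
    # Split on the first four tabs only, so the transcription is kept whole.
    sid, lang, script, user, text = line.rstrip("\n").split("\t", 4)
    return Transcription(int(sid), lang, script, user, text)


# Invented rows: the first has no reviewer, the second has one.
rows = [
    parse_transcription("10\tjpn\tHrkt\t\tこんにちは\n"),
    parse_transcription("11\tcmn\tLatn\talice\tnǐ hǎo\n"),
]
reviewed_ids = [t.sentence_id for t in rows if t.reviewed]
```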