Note

The data you will find here will NOT be useful unless you are coding a language tool or processing data.

If you simply want sentences that you can use to learn a language, check out the sentence lists. You can build your own, or view the ones that others have created. The lists can be downloaded and printed.

General information about the files

Many of the Japanese and English sentences are from the Tanaka Corpus, which belongs to the public domain.

Creative Commons

These files are released under CC BY 2.0 FR.

Some of our sentences are also available under CC0 1.0.

Licenses covering audio

The license covering an audio file is chosen by the contributor and is indicated on the page that lists the audio files they have contributed.

Questions?

If you have questions or requests, feel free to contact us. In general, we answer quickly.

Downloads

Custom exports

Sentence pairs

Use this tool to generate and download customized exports on demand.

Download all sentences in language A that are translated into language B, along with the translations.
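
The same kind of pairing can also be approximated locally from the weekly sentence and link exports described further down this page. The sketch below is only an illustration: `sentence_pairs` is a hypothetical helper, and it assumes UTF-8 text with three tab-separated fields per sentence row and two per link row.

```python
def sentence_pairs(sentence_lines, link_lines, lang_a, lang_b):
    """Return (text in lang_a, text in lang_b) for every linked pair."""
    by_id = {}
    for line in sentence_lines:
        # Split on the first two tabs only, so the text field stays whole.
        sid, lang, text = line.rstrip("\n").split("\t", 2)
        if lang in (lang_a, lang_b):
            by_id[int(sid)] = (lang, text)

    pairs = []
    for line in link_lines:
        a, b = (int(x) for x in line.split("\t"))
        # Keep a link only when it points from lang_a to lang_b.
        if a in by_id and b in by_id:
            if by_id[a][0] == lang_a and by_id[b][0] == lang_b:
                pairs.append((by_id[a][1], by_id[b][1]))
    return pairs
```

Because links are stored in both directions, checking the direction of each row is enough to avoid emitting a pair twice.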

Weekly exports

The files provided below are updated every Saturday at 6:30 a.m. (UTC).
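
For scripts that cache these files, it can be handy to estimate when the next refresh is due. The following is a minimal sketch, assuming the schedule stated above holds exactly; `next_export` is a hypothetical helper and expects a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

def next_export(now: datetime) -> datetime:
    """Return the first Saturday 06:30 UTC strictly after `now`."""
    now = now.astimezone(timezone.utc)
    # datetime.weekday(): Monday is 0, Saturday is 5.
    days_ahead = (5 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=6, minute=30, second=0, microsecond=0
    )
    if candidate <= now:
        # Already past this week's export, so move to next Saturday.
        candidate += timedelta(days=7)
    return candidate
```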

Sentences

Filename

{{sentences | filename}}

All languages
Only sentences in: Abkhaz Adyghe Afrihily Afrikaansk Ainu Aklanon Albaneesk Ald Eastersk Slavysk Ald Prussiaansk Ald Sakson Ald Tupi Aldarameesk Aldfrânsk Aldfrysk Aldgryksk Aldhebriuwsk Aldingelsk Aldnoarsk Aldspaansk Aldturksk Algerynsk Arabysk Amharyksk Arabysk Aragoneesk Araukaansk Assameesk Assyriaansk Nij-Aramaysk Asturysk Avaarsk Awadhi Aymara Azerbeidzjaansk Balineesk Baloetsjysk Bambara Banjar Bashkir Baskysk Bavariaansk Baybayanon Bengaleesk Berbers Berom Bhojpuri Bislama Bodo Bosnysk Bretonsk Brithenig Bulgaarsk Burmeesk Buryat Cayuga Cebuano Central Bikol Central Kanuri Central Kurdish (Soranî) Central Mnong Chagatai Chamorro Chechen Cherokee Chinese Pidgin English Chinook-jargon Chukchi Congo Swahili Deensk Dhivehi Drintsk Dungansk Dútsk Dutton Wrâld Faasjewurden Eastern Armenian Egyptysk-Arabysk Ekstremeensk Emiliaansk Erromintxela Erzya Esperanto Estysk Evenki Ewe Fenetiaansk Fêreusk Fiji Hindi Fijiaansk Finsk Fjetnameesk Foenisysk Folapúk Frânsk Friuliaansk Frysk Ga Gagauz Galats Galysysk Gan Sineesk Garhwali Georgysk Gheg Albanian Gilberteesk Gotysk Grienlânsk Grinslânsk Gryksk Guadeloupean Creole French Guaranysk Guerrero Nahuatl Gujarati Gulf Arabic Haida Haïtiaansk Kreoalsk Hakka Sineesk Hausa Hawaïaansk Hebriuwsk Heechsorbysk Hiligaynon Hill Mari Hindi Hitchiti Hmong Daw (Wyt) Hmong Njua (Grien) Ho Hongaarsk Hunsrik Ibansk Ido Iersk Igbo Ilokano Ingelsk Interglossa Interlingua Interlingue Inûktitût Iraaksk Arabysk Isan Italjaansk Jamaican Patois Japansk Javaansk Jewish Babylonian Aramaic Jewish Palestinian Aramaic Jiddysk Jin Chinese Juhuri (Judeo-Tat) K'iche' Kabardysk Kabyle Kalmyk Kamba Kanada Kantoneesk Kapampanjan Karachay-Balkar Karakalpak Karakhanid Kareliaansk Kasjmiri Kasjubiaansk Katalaansk Kazachsk Kekchi (Q'eqchi') Kelantan-Pattani Malay Keningau Murut Khakask Khasi Khmer Kinjarwanda Kirgizysk Kirundi Klingon Kölsch Komi-Zyrian Komy-Permjaaksk Konkani (Goan) Koreaansk Kornysk Korsikaansk Kotava Krimean Tatar Kroatysk Kumyk 
Kust-Kadazaansk Kuyonon Kwensk Láadan Ladin Ladino Lakota Lao Latgaalsk Latyn Laz Leechsorbysk Letsk Libyan Arabic Liguerysk Limburchsk Lingala Lingua Franca Nova Literêr Sineesk Litousk Livoniaansk Lojban Lombard Louisiana Kreoalsk Luganda Lúksemboarchsk Lushootseed Madureesk Mahasu Pahari Maithilysk Malagassysk Malayalam Malaysk (Fernakular) Maleisk Malteesk Mambae Manksk Mantsjoe Maori Marathi Marokkaansk Arabysk Marsjalleesk Masedoanysk Meadow Mari Meitei Mi’kmaq Midden Frânsk Midden Ingelsk Middle Persian (Pahlavi) Min Nan Sineesk Minangkabau Mingrelysk Mirandeesk Mohawk Moksja Mon Mongoalsk Mono (USA) Morisyen Muskogee (Creek) Naga (Tangshang) Nahuatl Nande Napolitaansk Nauruaansk Navajo Nederdútsk (Nedersaksysk) Nederlânsk Nepaleesk Newari Ngek Nigeriaansk Fulfulde Niueaansk Noard-Frysk Noardlik Levantynsk Arabysk Noardlik Moluks Malajaansk Noardlik Sami Nogai Noors - Bokmål Noors - Nynorsk Northern Kurdish (Kurmancî) Northern Zaza (Kirmanjki) Novial Nuer Nuosu Nyungar O'odham Odia (Oriya) Oekraynsk Oezbeeksk Ojibwe Okinawan Oksitaansk Orizaba Nahuatl Ossetysk Ottomaansk Turksk Palatynsk Dútsk Palauan Pali Pangasinaansk Papiamento Pashto Pensilvaansk Dútsk Perzysk Piedmonteesk Pikard Pipil Plains Cree Poalsk Portegeesk Pulaar Punjabi (East) Punjabi (Westersk) Qashqai Quechua Quenya Rapa-Nui Rendille Reto-Romaansk Roemeensk Rohingya Romanysk Russysk Rusynsk Samoaansk Samogisjaansk Sango Sanskryt Santali Saraiki Sardynsk Sealterfrysk Servysk Setswana Seychelloisk Kreoalsk Shona Shuswap Silezysk Sindarin Sindhi Sineesk-Mandarynsk Sinhala Sintraal Dusun Sintraal Huasteca Nahuatl Sisyljaansk Siuwsk Sjanghaineesk Sjavakano Skots Gaelik Skotsk Sloveensk Slowaaksk Somalysk South Levantine Arabic Southern Haida Southern Kurdish Southern Subanen Southern Zaza (Dimli) Spaansk Sranantongo Standert Marokkaanske Tamazight Súd Altai Súd Sotho Sûd-Samysk Sumeriaansk Sundaneesk Swabysk Swahily Swazysk Sweedsk Switsersk Dútsk SykL Sylheti Syrysk Tachawit Tagal Murut Tagalog 
Tahaggart Tamahaq Tahitiaansk Taisk Tajyksk Talossaansk Talysh Tamylsk Tarifit Tashelhit Tataarsk Telugu Temuaansk Tetunsk Tibetaansk Tigre Tigrinya Tjûvasj Toemboeka Tok Pisin Tokelauan Toki Pona Tonga (Zambezi) Tongaansk Tsjechysk Tsjinjanja Tsjoktaw Tsonga Turkmeensk Turksk Tuvaluaansk Tuviniaansk Uab Meto Udmurt Umbundu Urdu Urhobo Uyghur Vepsk Võro Walloonsk Waray Wayuu Welsk Western Armenian Wolof Wytrussysk Xhosa Xiang Sineesk Yakut Yndonesysk Yngrysk Yoruba Yslânsk Yucatec Maya Zaza Zulu Unbekende taal
File description
Contains all the sentences in the selected language. Each sentence is associated with a unique id and an ISO 639-3 language code.
Fields and structure
Sentence id [tab] Language [tab] Text
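
A minimal Python sketch of reading this three-field layout. The names `Sentence`, `parse_sentence_line` and `sentences_in` are illustrative and not part of Tatoeba; the sketch assumes the extracted file is UTF-8 text with one sentence per line.

```python
from typing import Iterable, Iterator, NamedTuple

class Sentence(NamedTuple):
    sentence_id: int
    lang: str   # ISO 639-3 code, for example "eng"
    text: str

def parse_sentence_line(line: str) -> Sentence:
    # Split on the first two tabs only, so a stray tab in the text is kept.
    sid, lang, text = line.rstrip("\n").split("\t", 2)
    return Sentence(int(sid), lang, text)

def sentences_in(lines: Iterable[str], lang: str) -> Iterator[Sentence]:
    """Yield only the sentences whose language code equals `lang`."""
    for line in lines:
        sentence = parse_sentence_line(line)
        if sentence.lang == lang:
            yield sentence
```

Passing an open file object as `lines` keeps memory use flat, since the rows are processed one at a time.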

Detailed Sentences

Filename

{{sentencesDetailed | filename}}

All languages
Only sentences in: (same language list as for the first export above)
File description
Contains additional fields for each sentence (owner name, date created/modified).
Fields and structure
Sentence id [tab] Language [tab] Text [tab] Username [tab] Date added [tab] Date last modified
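
One possible way to read these six fields is Python's standard `csv` module with a tab delimiter. This is a sketch under assumptions: the dictionary keys in `FIELDS` are names chosen here for illustration, and the two date fields are left as plain strings because their exact format is not specified above.

```python
import csv

# Illustrative key names for the six columns, in file order.
FIELDS = ["sentence_id", "lang", "text", "username", "date_added", "date_modified"]

def read_detailed(stream):
    """Yield one dict per row of a detailed sentences export."""
    reader = csv.DictReader(
        stream,
        fieldnames=FIELDS,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # quote marks in a sentence are ordinary text
    )
    for row in reader:
        row["sentence_id"] = int(row["sentence_id"])
        yield row
```

`csv.QUOTE_NONE` matters here because sentences often contain quotation marks, which the default dialect would try to interpret.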

Original and Translated Sentences

Filename
sentences_base.tar.bz2
File description
Each sentence is listed as original or a translation of another. The "base" field can have the following values:
  • zero: The sentence is original, not a translation of another.
  • greater than zero: The id of the sentence from which it was translated.
  • \N: Unknown (rare).
Fields and structure
Sentence id [tab] Base field
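
The three cases of the base field can be sketched as follows. `parse_base` and `classify` are hypothetical helper names, and the sketch assumes the unknown marker appears literally as a backslash followed by `N`.

```python
UNKNOWN = r"\N"  # literal backslash + N, as listed above

def parse_base(field):
    """Return None when the base is unknown, otherwise the integer value."""
    if field == UNKNOWN:
        return None
    return int(field)

def classify(line):
    """Return (sentence id, 'original' | 'translation' | 'unknown')."""
    sid, base = line.rstrip("\n").split("\t")
    base_id = parse_base(base)
    if base_id is None:
        return int(sid), "unknown"
    if base_id == 0:
        return int(sid), "original"
    # Any value greater than zero is the id of the source sentence.
    return int(sid), "translation"
```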

Sentences (CC0)

Filename

{{sentencesCC0 | filename}}

All languages
Only sentences in: Old Aramaic, Old Frisian, Ancient Greek, Ancient Hebrew, Old Norse, Algerian Arabic, Arabic, Bengali, Berber, Danish, German, Esperanto, Finnish, Phoenician, Volapük, French, Hebrew, Hindi, Ho, Hungarian, Ido, English, Interlingua, Italian, Japanese, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, Yiddish, Kabyle, Cantonese, Karelian, Catalan, Klingon, Konkani (Goan), Kven, Láadan, Ladino, Latin, Ligurian, Literary Chinese, Middle English, Dutch, Norwegian Bokmål, Nyungar, Ukrainian, Polish, Portuguese, Russian, Santali, Mandarin Chinese, Spanish, Standard Moroccan Tamazight, Swedish, Sylheti, Tachawit, Toki Pona, Czech, Welsh, Belarusian, Unknown language
File description
Contains all the sentences available under CC0.
Fields and structure
Sentence id [tab] Language [tab] Text [tab] Date last modified

Links

Filename
links.tar.bz2
File description
Contains the links between the sentences. 1 [tab] 77 means that sentence #77 is the translation of sentence #1. The reciprocal link is also present, so the file will also contain a line that says 77 [tab] 1.
Fields and structure
Sentence id [tab] Translation id
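
Since every link appears in both directions, a reader of this file may want either a lookup table or a list with each pair counted once. A minimal sketch of both, using hypothetical helper names `load_links` and `unique_pairs`:

```python
from collections import defaultdict

def load_links(lines):
    """Map each sentence id to the set of ids it is linked to."""
    translations = defaultdict(set)
    for line in lines:
        a, b = line.split("\t")
        # int() ignores the trailing newline on the second field.
        translations[int(a)].add(int(b))
    return translations

def unique_pairs(translations):
    """Return each reciprocal pair once, smaller id first, sorted."""
    return sorted(
        {(min(a, b), max(a, b)) for a, targets in translations.items() for b in targets}
    )
```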

Tags

Filename
tags.tar.bz2
File description
Contains the list of tags associated with each sentence. 381279 [tab] proverb means that sentence #381279 has been assigned the "proverb" tag.
Fields and structure
Sentence id [tab] Tag name
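
Grouping the rows by tag gives a quick way to find every sentence carrying a given tag. In this sketch `sentences_by_tag` is an illustrative name, and the line is split on the first tab only, on the assumption that a tag name may contain spaces.

```python
from collections import defaultdict

def sentences_by_tag(lines):
    """Map each tag name to the list of sentence ids that carry it."""
    by_tag = defaultdict(list)
    for line in lines:
        sid, tag = line.rstrip("\n").split("\t", 1)
        by_tag[tag].append(int(sid))
    return by_tag
```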

Lists

Filename
user_lists.tar.bz2
File description
Contains the list of sentence lists.
Fields and structure
List id [tab] Username [tab] Date created [tab] Date last modified [tab] List name [tab] Editable by

Sentences in lists

Filename
sentences_in_lists.tar.bz2
File description
Indicates which sentences belong to which lists. 13 [tab] 381279 means that sentence #381279 belongs to the list with id 13.
Fields and structure
List id [tab] Sentence id
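
This file is most useful when joined with the list metadata from user_lists.tar.bz2. The sketch below counts the sentences in each list; `list_sizes` is a hypothetical helper, and it assumes every metadata row has exactly the six fields given for that file, with no tab inside a list name.

```python
from collections import Counter

def list_sizes(list_lines, membership_lines):
    """Map list id -> (list name, number of sentences in that list)."""
    # The first column of every membership row is the list id.
    counts = Counter(int(line.split("\t", 1)[0]) for line in membership_lines)
    sizes = {}
    for line in list_lines:
        fields = line.rstrip("\n").split("\t")
        list_id, _user, _created, _modified, name, _editable_by = fields
        # Counter returns 0 for lists that have no membership rows.
        sizes[int(list_id)] = (name, counts[int(list_id)])
    return sizes
```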

Japanese indices

Filename
jpn_indices.tar.bz2
File description
Contains the equivalent of the "B lines" in the Tanaka Corpus file distributed by Jim Breen. See this page for the format. Each entry is associated with a pair of Japanese/English sentences. Sentence id refers to the id of the Japanese sentence. Meaning id refers to the id of the English sentence.
Fields and structure
Sentence id [tab] Meaning id [tab] Text

Sentences with audio

Filename
sentences_with_audio.tar.bz2
File description
Contains the ids of the sentences, in all languages, for which audio is available. Other fields indicate who recorded the audio, its license and a URL to attribute the author. If the license field is empty, you may not reuse the audio outside the Tatoeba project.
Downloading audio
A single sentence can have one or more audio recordings, each from a different voice. To download a particular recording, use its audio id to build the download URL. For example, the recording with id 1234 is available at https://tatoeba.org/audio/download/1234.
Fields and structure
Sentence id [tab] Audio id [tab] Username [tab] License [tab] Attribution URL
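
A sketch of building the download URL and skipping recordings that may not be reused. The URL pattern comes from the example above; `audio_download_url` and `reusable_audio` are hypothetical names, and treating `\N` as an empty license is an assumption borrowed from the base file, not something stated for this one.

```python
AUDIO_URL = "https://tatoeba.org/audio/download/{audio_id}"

# Assumption: an empty license is either a blank field or a literal \N.
EMPTY_LICENSE = ("", r"\N")

def audio_download_url(audio_id: int) -> str:
    return AUDIO_URL.format(audio_id=audio_id)

def reusable_audio(lines):
    """Yield (sentence id, download URL, license) for licensed recordings."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        sid, audio_id, _username, license_name, _attribution_url = fields
        if license_name in EMPTY_LICENSE:
            continue  # not reusable outside the Tatoeba project
        yield int(sid), audio_download_url(int(audio_id)), license_name
```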

User skill level per language

Filename
user_languages.tar.bz2
File description
Indicates the self-reported skill levels of members in individual languages.
Fields and structure
Language [tab] Skill level [tab] Username [tab] Details

Users' sentence reviews

Filename
users_sentences.csv
File description
Contains sentences reviewed by users. The value of the review can be -1 (sentence not OK), 0 (undecided or unsure), or 1 (sentence OK). Warning: this data is still experimental.
Fields and structure
Username [tab] Sentence id [tab] Review [tab] Date added [tab] Date last modified
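
Despite the .csv extension, the field list above is tab-separated. One simple way to use the reviews is to add up the values per sentence; that scoring rule is a choice made for this sketch, not something Tatoeba prescribes, and `review_scores` is a hypothetical helper.

```python
from collections import defaultdict

VALID_REVIEWS = (-1, 0, 1)  # not OK, undecided or unsure, OK

def review_scores(lines):
    """Map sentence id -> sum of all review values it received."""
    totals = defaultdict(int)
    for line in lines:
        _username, sid, review, _added, _modified = line.rstrip("\n").split("\t")
        value = int(review)
        if value not in VALID_REVIEWS:
            raise ValueError(f"unexpected review value: {review!r}")
        totals[int(sid)] += value
    return dict(totals)
```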

Transcriptions

Filename

{{transcriptions | filename}}

All languages
Only sentences in: Japanese, Cantonese, Uzbek, Mandarin Chinese
File description
Contains all transcriptions in auxiliary or alternative scripts. A username associated with a transcription indicates the user who last reviewed and possibly modified it. A transcription without a username has not been marked as reviewed. The script name is defined according to the ISO 15924 standard.
Fields and structure
Sentence id [tab] Language [tab] Script name [tab] Username [tab] Transcription
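
A sketch of parsing one transcription row and deriving whether it has been reviewed. `parse_transcription` and the dictionary keys are illustrative; the sketch assumes a missing username shows up as a blank field or as a literal `\N`, which is not confirmed above.

```python
# Assumption: an absent username is either a blank field or a literal \N.
NO_USER = ("", r"\N")

def parse_transcription(line):
    """Return one transcription row as a dict with a derived 'reviewed' flag."""
    # Split on the first four tabs only, so the transcription stays whole.
    sid, lang, script, username, text = line.rstrip("\n").split("\t", 4)
    return {
        "sentence_id": int(sid),
        "lang": lang,
        "script": script,  # ISO 15924 code, for example "Hrkt"
        "reviewed": username not in NO_USER,
        "transcription": text,
    }
```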