This repository contains automatic translation tools for the Flores200 languages using the NLLB model from Meta AI (https://github.com/facebookresearch/fairseq/tree/nllb).
This API allows you to translate text into 200 languages automatically, detect the source language of a text among 200 languages, and detect toxic words in a text.
Get all the available translation languages.
Method: GET
A list of all the available languages, given as FLORES-200 codes.
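A minimal sketch of calling this endpoint from Python, using only the standard library. It assumes the server is reachable at http://localhost:8080 (the port published by the Docker commands in this README) and that the response body is JSON; the helper names `build_langs_request` and `fetch_langs` are illustrative and not part of this repository.

```python
import json
import urllib.request

# Assumption: the server listens on the port published by `docker run -p 8080:8080`.
BASE_URL = "http://localhost:8080"


def build_langs_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the /langs endpoint."""
    return urllib.request.Request(base_url.rstrip("/") + "/langs", method="GET")


def fetch_langs(base_url: str = BASE_URL):
    """Send the request and decode the body, assuming the server returns JSON."""
    with urllib.request.urlopen(build_langs_request(base_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `fetch_langs()` should return the language list described above.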
Automatic translation using the NLLB model from Meta AI. Translates input_text into the langs_out languages. Available languages: [all languages listed by the /langs endpoint].
Method: POST
| Field | Type | Required | Description |
|---|---|---|---|
| input_text | string | Yes | The text to be translated. |
| langs_out | string | No | The languages to translate the input text to (comma separated). If not specified, it will translate to all languages. |
The result of the translation service, containing translated text in all the languages specified.
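The request fields above could be assembled as in the following sketch. Several things here are assumptions rather than facts from this README: the route name `/translate` is hypothetical (the actual path is not stated here), the body is assumed to be JSON, and the base URL is taken from the Docker port mapping. Only the field names `input_text` and `langs_out` come from the table.

```python
import json
import urllib.request
from typing import List, Optional

BASE_URL = "http://localhost:8080"  # assumption: port published by the Docker commands
TRANSLATE_PATH = "/translate"       # hypothetical: the real route name is not stated here


def build_translate_payload(input_text: str, langs_out: Optional[List[str]] = None) -> dict:
    """Build the request fields; leaving out langs_out asks for all languages."""
    payload = {"input_text": input_text}
    if langs_out:
        # The API expects a single comma-separated string of FLORES-200 codes.
        payload["langs_out"] = ",".join(langs_out)
    return payload


def build_translate_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request, assuming a JSON body."""
    return urllib.request.Request(
        base_url.rstrip("/") + TRANSLATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the resulting request to `urllib.request.urlopen` sends it. If the server expects form fields or query parameters instead of JSON, only `build_translate_request` needs to change.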
Detect toxic words in a text without specifying the source language.
Method: POST
| Field | Type | Required | Description |
|---|---|---|---|
| input_text | string | Yes | The text to be checked for toxicity. |
The result of the toxicity detection service, containing toxic words in input_text.
Detect source language of a text from 200 languages.
Method: POST
| Field | Type | Required | Description |
|---|---|---|---|
| input_text | string | Yes | The text to detect the language of. |
The result of the language detection service, containing the detected language of the input text.
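Both the toxicity and the language detection services take a single `input_text` field, so one small helper can serve either. The sketch below also shows basic failure handling. It assumes a JSON request and response and the base URL from the Docker port mapping; the route is passed in as a parameter because the route names are not stated in this README, and the function names are illustrative.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: port published by the Docker commands


def build_text_request(path, input_text, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying only input_text."""
    body = json.dumps({"input_text": input_text}).encode("utf-8")  # assumption: JSON body
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_text(path, input_text, base_url=BASE_URL):
    """Send the request; return the decoded JSON body, or None if the call fails."""
    request = build_text_request(path, input_text, base_url)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.URLError as error:  # HTTPError (4xx/5xx) is a subclass
        print(f"Request to {path} failed: {error}")
        return None
```

To use it, substitute the server's actual route for `path`, for example `post_text(detection_route, "Bonjour tout le monde")`.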
To learn how to use these tools, check out the /examples folder. There is also a frontend application built with Streamlit that consumes this API. To learn more, check out the following repository: https://github.com/rosasalberto/automatic_translation_frontend.
https://hub.docker.com/repository/docker/rosasalberto/translation-service/general
- Get image
docker pull rosasalberto/translation-service
- Run Image
docker run --gpus all -p 8080:8080 rosasalberto/translation-service
Modify translation_langs in config.py to include the languages you want to be able to translate, using the Flores200 language codes.
- Install CUDA 11.6
- Install Microsoft C++ Build Tools if using Windows:
- Install Python 3.7.2 (https://www.python.org/downloads/release/python-372/) and add it to PATH. The following command has to return Python 3.7.2:
python --version
- Upgrade pip
python -m pip install --upgrade pip
- Install pipenv
pip install pipenv
- Clone this repo
git clone https://github.com/rosasalberto/automatic_translation_server
- Change directory, install the needed dependencies in a virtual environment, and activate it
cd automatic_translation_server
pipenv install --dev --python 3.7.2
pipenv shell
- Download the Language Detection (LID) model from the provided link, https://tinyurl.com/nllblid218e, and add it to the '/weights' folder
- Configure the server by modifying the config.py file:
  - Modify translation_langs to include the languages you want to be able to translate, using the Flores200 language codes.
  - Modify lid_path to the full path of the LID model.
  - Modify path_toxicity_data to the full path to the toxicity vocab files.
- Run the server
uvicorn server:app --reload
- Set up the frontend application from https://github.com/rosasalberto/automatic_translation_frontend
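As a rough illustration of the configuration step above, the three settings might look like the sketch below. All values are hypothetical: the real config.py may organize these settings differently, the LID file name and the toxicity folder name are assumptions, and `invalid_codes` is an illustrative helper that only checks the shape of a code (three lowercase letters, an underscore, and a four-letter script name), not whether the language is actually supported.

```python
import os
import re

# Hypothetical values; adapt them to the layout of your checkout.
translation_langs = ["eng_Latn", "spa_Latn", "fra_Latn"]
lid_path = os.path.abspath(os.path.join("weights", "lid218e.bin"))  # assumed file name
path_toxicity_data = os.path.abspath("toxicity_data")               # assumed folder name

# Shape of a FLORES-200 code, e.g. "eng_Latn" or "zho_Hans".
FLORES_CODE = re.compile(r"^[a-z]{3}_[A-Z][a-z]{3}$")


def invalid_codes(codes):
    """Return the entries that do not look like FLORES-200 codes."""
    return [code for code in codes if not FLORES_CODE.match(code)]


if invalid_codes(translation_langs):
    raise ValueError(f"Unexpected language codes: {invalid_codes(translation_langs)}")
```

Checking the codes before starting the server catches typos such as `en` or `spa-Latn` early; the authoritative list remains the one returned by the /langs endpoint.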
To build a Docker image, you need to have Docker installed on your machine. If you don't have it already, you can install it by following the instructions on the Docker website: https://docs.docker.com/get-docker/
- Get the NVIDIA image for CUDA 11.6
docker pull nvidia/cuda:11.6.2-base-ubuntu20.04
- Build the Docker image
docker build -t translation-service .
- Run the image
docker run --gpus all -p 8080:8080 translation-service
- Optional: Upload your image to Docker Hub
- During package installation, if you experience problems related to the Python version, try the following command, which forces pipenv to use a given version of Python:
pipenv install --dev --python 3.7.2
- If you do not have Python 3.7.2 on your system, you can:
  - Install Python using your operating system's package manager. On Linux systems, you can use apt-get or yum, and on macOS you can use brew.
  - Download the Python installer from the official Python website (https://www.python.org/) and run it to install Python on your system.
  - Use a version manager such as pyenv or asdf to install and manage multiple versions of Python on your system.
Once you have a Python interpreter installed, you should be able to use pipenv to install the dependencies for your project.
If you experience any problems, don't hesitate to contact: rosas.alberto.upc@gmail.com