MLflow Tracking is an open-source API for live logging of parameters, metrics, and metadata while running machine learning code. It gives you the visibility to monitor the progress of the training process and take action if necessary. DAGsHub provides an MLflow Tracking server with each project, so you can log experiment information with MLflow directly to the Experiment Tab.
How Does It Work?¶
When you create a repository on DAGsHub, a remote tracking server is automatically created and configured with the repository. The repository's MLflow tracking server will be located at:
https://dagshub.com/<username>/<repo>.mlflow
The server endpoint can also be found under the ‘Remote’ button:
When you define DAGsHub's MLflow server as the remote server, the output of the run will be added to the Experiment Tab.
Only a repository contributor can log experiments.
How To Use It?¶
Install And Import MLflow¶
Start by installing the MLflow Python package in your virtual environment using pip:
pip install mlflow
Then import MLflow into your Python module using
import mlflow
and log the information with the MLflow logging functions.
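As a minimal sketch of what logging can look like, the helper below calls three of MLflow's logging functions (mlflow.start_run, mlflow.log_param, and mlflow.log_metric). The helper name log_run, its optional tracker argument, and the example values are hypothetical and exist only for illustration; they are not part of MLflow or DAGsHub.

```python
def log_run(params, metrics_per_step, tracker=None):
    """Log each hyperparameter once and each metric at every step.

    `tracker` defaults to the mlflow module. Any object exposing the same
    start_run / log_param / log_metric functions also works, which makes
    dry runs possible without a tracking server (an illustrative assumption).
    """
    if tracker is None:
        import mlflow  # requires `pip install mlflow`
        tracker = mlflow

    # Everything logged inside the `with` block belongs to a single run.
    with tracker.start_run():
        for name, value in params.items():
            tracker.log_param(name, value)
        for step, metrics in enumerate(metrics_per_step):
            for name, value in metrics.items():
                tracker.log_metric(name, value, step=step)


# Example call with made-up values:
# log_run({"learning_rate": 0.01}, [{"loss": 0.9}, {"loss": 0.5}])
```

Passing step to log_metric keeps the values of one metric ordered, so the metric can be read as a curve over the course of training.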
Set The MLflow Server URI¶
You can set the MLflow server URI in your code by calling mlflow.set_tracking_uri with the URL of the repository's MLflow server.
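A minimal sketch of setting the URI from code, assuming the URL form used in the shell command later on this page (https://dagshub.com/<username>/<repo>.mlflow). The two helper names are hypothetical; mlflow.set_tracking_uri is the MLflow call that does the work.

```python
def dagshub_mlflow_uri(username: str, repo: str) -> str:
    """Build the tracking URI in the form used on this page."""
    return f"https://dagshub.com/{username}/{repo}.mlflow"


def use_dagshub_tracking(username: str, repo: str) -> str:
    """Point MLflow at the repository's tracking server and return the URI."""
    # Imported here so the URI helper above works without MLflow installed.
    import mlflow

    uri = dagshub_mlflow_uri(username, repo)
    mlflow.set_tracking_uri(uri)
    return uri


# Example call with placeholder names:
# use_dagshub_tracking("<username>", "<repo>")
```

Setting the URI in code keeps the target repository next to the training script, so it travels with the project instead of living in the shell session.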
Set the MLflow server URI using an environment variable
You can also define your MLflow server URI using the MLFLOW_TRACKING_URI environment variable.
We don't recommend this approach, since you might forget to reset the environment variable when switching between different projects. This might result in logging experiments to the wrong repository.
If you still prefer using the environment variable, we recommend setting it only for the current command, like the following:
MLFLOW_TRACKING_URI=https://dagshub.com/<username>/<repo>.mlflow python <file-name>.py
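The same per-command scoping can be sketched in Python by passing a modified environment to a child process, which leaves the parent environment untouched. The helper name run_python and the script name train.py are hypothetical.

```python
import os
import subprocess
import sys


def run_python(args, tracking_uri, **run_kwargs):
    """Run `python <args>` with MLFLOW_TRACKING_URI set only for that child."""
    # Copy the current environment and add the variable to the copy, so
    # os.environ of the calling process is left unchanged.
    child_env = {**os.environ, "MLFLOW_TRACKING_URI": tracking_uri}
    return subprocess.run(
        [sys.executable, *args], env=child_env, check=True, **run_kwargs
    )


# Example call with placeholder names:
# run_python(["train.py"], "https://dagshub.com/<username>/<repo>.mlflow")
```

Because only the copied environment is modified, a later run for a different project does not inherit the variable.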
The DAGsHub MLflow server has built-in access controls. Only a repository contributor (someone who can git push to the repository) can log experiments.
In order to use basic authentication with MLflow, you need to set the following environment variables:
MLFLOW_TRACKING_USERNAME - DAGsHub username
MLFLOW_TRACKING_PASSWORD - DAGsHub password or, preferably, an access token
export MLFLOW_TRACKING_USERNAME=<username/token>
export MLFLOW_TRACKING_PASSWORD=<password>
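In a notebook, where exporting shell variables is less convenient, the same two variables can be set from Python before any logging call. This is a sketch under that assumption; the helper name set_mlflow_credentials is hypothetical, and reading the token with getpass is one possible way to keep it out of the notebook source.

```python
import os


def set_mlflow_credentials(username: str, token: str) -> None:
    """Expose DAGsHub credentials to MLflow for the current Python process."""
    # MLflow picks up basic-auth credentials from these environment variables.
    os.environ["MLFLOW_TRACKING_USERNAME"] = username
    os.environ["MLFLOW_TRACKING_PASSWORD"] = token


# Example call that prompts for the token instead of hard-coding it:
# from getpass import getpass
# set_mlflow_credentials("<username>", getpass("DAGsHub access token: "))
```

The variables set this way last only as long as the Python process, so they do not leak into other shells or projects.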
Congratulations, you are ready to start logging experiments. Now, when you run your code, you will see new runs appear in the experiment tables, with their status and origin:
How To Use MLflow In A Colab Environment?¶
We shared two examples of experiment logging to DAGsHub’s MLflow server in a Colab environment.
When To Use It?¶
With MLflow Tracking, you can log your experiments by adding a few lines of code to your project. It's a fast and easy way to monitor the progress of your project. The downside of using MLflow alone is that results are not easy to reproduce: the experiment logs aren't tied to the state of the code and the data that produced them. Therefore, even when using MLflow Tracking, we recommend tracking your work with Git whenever you achieve meaningful results that you might want to reproduce in the future.
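One way to narrow the gap between a logged run and the code that produced it is to record the current Git commit as a tag on the run. The sketch below is an illustration, not a DAGsHub feature: the helper names and the tag name git_commit are hypothetical, and it covers only the code state, not the data.

```python
import subprocess


def current_git_commit():
    """Return the HEAD commit hash, or None if it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or this is not a repository with a commit.
        return None
    return result.stdout.strip()


def tag_run_with_commit(tag_name="git_commit"):
    """Attach the current commit hash to the active MLflow run, if known."""
    import mlflow  # requires `pip install mlflow`

    commit = current_git_commit()
    if commit is not None:
        mlflow.set_tag(tag_name, commit)
    return commit


# Example: call tag_run_with_commit() inside a `with mlflow.start_run():` block.
```

The tag is only meaningful if the working tree was committed before the run, since uncommitted changes are not reflected in the hash.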
Known Issues, Limitations & Restrictions¶
DAGsHub currently doesn't support logging MLflow artifacts, but we might add support soon. Please contact us on our Discord channel if this is important to you.