
Pawsome Updates January '23

About DagsHub

DagsHub simplifies the process of building better models and managing unstructured data projects by consolidating data, code, experiments, and models in one place.



    This month you will find:

    🍪  CookieCutter MLOps template

    🚀  DDA Challenge

    🎤  MLOps Podcast

    💻  Dev Updates

    🍪  CookieCutter MLOps Template

    Cookiecutter MLOps is a production-focused ML project template inspired by a thought-provoking tweetstorm by Shreya Shankar about MLOps principles. It implements the first 5 principles (beginner level) of the tweetstorm, including pre-commit hooks, attaching a Git hash to each trained model, using a monorepo, versioning data, and setting data quality SLAs. Contributions are welcome for the other rules, and the template is available for use today.

    Read more here: https://dagshub.com/blog/cookiecutter-mlops-a-production-focused-project-template/
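One of the principles above, attaching a Git hash to each trained model, can be sketched in a few lines of Python. This is a minimal illustration and not the template's actual implementation: the helper names `current_git_hash` and `write_model_metadata` and the `.meta.json` sidecar file are assumptions made for this example.

```python
import json
import subprocess
from pathlib import Path
from typing import Optional


def current_git_hash(repo_dir: str = ".") -> Optional[str]:
    """Return the HEAD commit hash of repo_dir, or None if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, the directory is missing, or it is not a repository.
        return None
    return result.stdout.strip()


def write_model_metadata(model_path: str, git_hash: Optional[str]) -> Path:
    """Write a JSON sidecar next to the model file recording the commit hash."""
    model_file = Path(model_path)
    # Hypothetical naming convention: model.pt -> model.pt.meta.json
    sidecar = model_file.with_name(model_file.name + ".meta.json")
    metadata = {"model_file": model_file.name, "git_hash": git_hash or "unknown"}
    sidecar.write_text(json.dumps(metadata, indent=2))
    return sidecar
```

After saving a model, calling `write_model_metadata("models/model.pt", current_git_hash())` would leave a small JSON file beside it, so anyone loading the model later can trace it back to the code that produced it.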

    🚀  DDA Challenge

    The DDA Challenge is a great opportunity to demonstrate your data engineering skills and explore the possibilities of Direct Data Access. Through this challenge, participants can learn more about the data engineering landscape and how to use DDA to make their data pipelines faster and more efficient. We can't wait to see what everyone creates with DDA!

    Read more here: https://dagshub.com/blog/dda-challenge/

    🎤 MLOps Podcast

    🎨  Stable Diffusion and generative models with David Marx

    In this episode, Dean spoke with David Marx, Distinguished Engineer at Stability AI. They dove into how David got into machine learning, open-source software, and Stability AI.

    The recording is available on YouTube, and you can also listen with your favorite podcast app: https://pod.link/1565390757/episode/c737de458e87daebd14d84ed3f4ba998

    💻  Dev Updates

    Our R&D team worked hard this month and delivered a lot of updates. Here are some of them.

    DDA improvements

    We added a new function called create_dataset. It creates a new repository on DagsHub and uploads an entire dataset to it with one line of code.

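    from dagshub.upload import create_dataset

    # Create the "my-awesome-dataset" repository on DagsHub and upload
    # everything under path/to/data to it; private=False keeps the repository public.
    create_dataset("my-awesome-dataset", "path/to/data", private=False)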
    

    In addition, we improved the overall performance of all operations in DDA.

    Design Changes & upgrades

    Our designer Anna is working hard with us to redesign the website, with the goal of simplifying it, improving the UX, and making it more pleasant to the eye :)

    This month we show you the new and improved navbar & footer of the website.

    Navbar - before
    Navbar - after
    Footer - before
    Footer - after

    They also have a new, responsive version. Try it on your phone!

    The new navbar & footer have also arrived in our docs, along with an improvement to the overall docs design!

    Deprecating “Reports”

    We experimented with a feature called “Reports” that allowed documenting parts of a project with a WYSIWYG markdown editor. We recently decided to end this experiment and deprecate “Reports”, in order to keep DagsHub simple and make room for other things :)

    If you disagree, or share a point of view that we might be missing, please reach out to us via Discord.

    Model File Visualization

    We added visualization for model files powered by Netron!

    You can check it out here: https://dagshub.com/yonomitt/BetterSquirrelDetector/src/main/models/model3_finetune.pt

    Miscellaneous updates

    • Along with the walk-through video, we added a game to the Label Studio loading screen. Now you can enjoy the waiting time even more!
    • We now support “custom tasks” in DagsHub annotations. This means users can construct their own Label Studio tasks that contain more than one file per task! This feature is still experimental, and it is not documented yet because the flow is not finalized. If you’re interested in this, please let us know!
    • We fixed many bugs! 🐞

    Thank you for reading this month's Pawsome Updates!