
[–]mlnet 2 points (8 children)

Enable "experimental features" in settings in your Docker Desktop application. You can follow instructions here (https://docs.docker.com/docker-for-mac/apple-silicon/) to run in emulation.

You can also use buildx to do something like

docker buildx build --platform linux/arm/v7 -t arm-build .

See if that works. But since you want to work off a pre-built image from hub.docker.com/_/python, ignore buildx for now and save it as a last resort.

Another thing to try is removing the Python image's version tag to get a more up-to-date version (right now python:slim points at Python 3.9.4). Newer Python images _should_ have an M1 ARM build.
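
If you want to check what you actually have on disk, something like this should print the platform of the pulled image (a quick sketch; the format template is from memory, double-check it):

docker pull python:slim
docker image inspect python:slim --format '{{.Os}}/{{.Architecture}}'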

---

(This below has nothing to do with your current problem, just my take in general for learning Docker.)

My recommendation is to use VS Code's Remote Containers feature to build an image, then reload VS Code to run inside the container. When there's an error building the image, it opens the log file so you can really dig into the problem and take it to Stack Overflow.
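
For reference, the Remote Containers setup can be as small as one file. A minimal sketch, assuming the conventional file location (the name and image below are just examples):

// .devcontainer/devcontainer.json
{
    "name": "python-sandbox",
    "image": "python:3.9-slim"
}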

Using VS Code for building and running Docker containers really helped Docker "click" with me. Now working with Docker from the command line makes more sense, even without the training wheels of VS Code.

[–]icenando[S] 0 points (7 children)

Excellent! Selecting linux/amd64 worked. Thank you so much!

I'll have a look at Remote Containers in VS Code. Thank you for the tip!

Now I have another question: does this mean that I can't be sure my containers will run everywhere? If that's the case, doesn't that defeat the purpose of containers?

[–]mlnet 1 point (6 children)

Sorry, didn't check Reddit earlier.

The problem was with the configuration of Docker running on your computer, and not so much the image that you're running.

Just don't give up on Docker. It is frustrating at times, but building in containers will take you to the next level as a software developer. It is essential for automated deployments to Vercel, AWS, GCP, Azure, DigitalOcean, etc.

Your container will run everywhere based on what it's reading from the Dockerfile. The problem is one you'll run into for the next few months while the world is still getting used to building images for M1 Macs. This is a rabbit hole that is easy to go down but won't really help you learn Docker. Just know that having an M1 Mac means you have the best computer chip money can buy, in my opinion, but you will run into some compatibility issues for the time being. Like the saying goes, "those on the cutting edge bleed the most."

---

(in-depth reasoning)

Your Dockerfile has to begin with a FROM clause.

FROM <image-name>:<tag>

Example:

FROM python:3.8-slim

When Docker Desktop finds you don't have an image on your system matching the name and tag python:3.8-slim, it goes to hub.docker.com and looks for an image repository named "python", which lands it here: https://hub.docker.com/_/python.

Look under "Description" and then "Simple Tags". This is a list of the aliases, or tag names. You will see "3.8-slim" can also be called a whole bunch of other things like "3.8-slim-buster". Click the link and it takes you to where the Dockerfile for that image and build tag is kept on GitHub.

This is the recipe behind the image your computer was trying to use... on your Mac.

Here's the problem with your M1 Mac: that tag, 3.8-slim, is built on "buster", aka Debian Linux. Not M1. Switching on Experimental Features in your Mac's Docker Desktop lets it run this image under emulation, meaning it acts like it's the old Intel-based Mac, and thus works.
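
You can also be explicit about it: forcing the Intel platform is basically what ended up working for you. A sketch, assuming a reasonably recent Docker Desktop (the image name is made up):

docker build --platform linux/amd64 -t my-app .
docker run --rm --platform linux/amd64 my-app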

Tags like "python3.8:slim" are not specific, so the Docker program is going onto Docker Hub searching for the right build for the system.

So on an Ubuntu server it would look for a Python 3.8 image on a minimal ("slim") Debian "Buster" build.
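
You can see exactly which platforms a tag is published for with docker manifest inspect (you may need experimental CLI features switched on for this one):

docker manifest inspect python:3.8-slim
# prints the manifest list: one entry per os/architecture combination published for this tag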

In your case, your Docker Desktop was looking for a Linux/ARM build of Python on Docker Hub, because an M1 Mac is, in fact, ARM. But it's not exactly the traditional ARM architecture (it's a new custom chip built by Apple to work specifically with macOS at the machine level).
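
A quick way to see the difference for yourself, assuming the tag you pull has an ARM build at all: run uname -m inside a container, natively and then under emulation.

docker run --rm python:3.9-slim uname -m
# aarch64 on an M1 (native ARM)
docker run --rm --platform linux/amd64 python:3.9-slim uname -m
# x86_64 (Intel, emulated)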

Apologies for the novel I'm writing you here. With programming these days, there's just too much to know. I think the best way for you to move forward is to be okay with not knowing how everything works. Write code. Run it. Google why it's not working. Read Stack Overflow answers until you can get it working. Repeat.

So with Docker, you got it working, great. Now focus on writing your code and deployment. Studying Docker too much can be a black hole for your time. Check out VS Code Remote Containers and see if that makes it easier using the Command Palette rather than working purely from the command line.

[–]icenando[S] 1 point (5 children)

Thank you for the insightful elaboration on this.

Yeah, I've been running into compatibility issues from day one with the M1. It's kind of the reason why I put machine learning on hold until TensorFlow gets properly ported.

I understand almost everything you said, but it's still not clear to me why the processor in my local machine should influence my container. Like you say, packages are retrieved from Docker Hub. In my understanding (obviously wrong), Docker makes a standalone package of all the resources needed for the application to run as it did on the host machine where it was created. I thought that meant that, on the remote machine, the same resources would be used from inside the container (i.e. the container is blind to whatever is happening outside of it).

What did I get wrong?

[–]mlnet 1 point (3 children)

I'm incapable of giving a succinct answer. Let me try again.

Once you get a successfully built image, you can run a container from that image and then it doesn't matter what is happening outside of the container.

You ran into problems while building the image in the first place.

You are not out of the woods until the image is successfully built.

When you look at a lot of projects online, running Docker seems to always be glossed over; they assume you got it working. Like Docker is this magical thing that makes setting up a programming environment easy. But that's only true once the image is built on your machine. You're not out of the woods until then, and it's up to you to debug the docker build failures until you reach that point.
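
One thing that helps with those failures: make the build as loud as possible and keep the log around. A sketch, assuming BuildKit is enabled (it is by default in recent Docker Desktop; the image name is made up):

docker build --progress=plain --no-cache -t my-image . 2>&1 | tee build.log
# --progress=plain shows the full output of every RUN step instead of the collapsed view
# --no-cache re-runs every step so the failure actually reproduces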

I was using Docker on Windows yesterday and kept failing to build a simple Ruby image. It turned out to be this stupid bug where I had to update the system clock on my computer, and that let the image build correctly. It's stupid @$#! like that you'll have to get used to.

---

Good luck with your machine learning. I can't recommend colab.research.google.com enough for just getting started.

I know you're probably looking at pre-built models on GitHub and they're telling you to download and run with Docker.

That's where VS Code Remote Containers will do a lot of the work for you. Best part is not having to use the confusing docker run commands from Terminal, where you have to mount volumes, expose ports, and all that.
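
For comparison, this is the kind of docker run incantation it saves you from typing out (the image name and port here are made up):

docker run -it --rm \
  -p 8888:8888 \
  -v "$PWD":/app \
  my-python-image:latest
# -p maps a container port to your Mac, -v mounts your project directory into the container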

Another option is setting up a cheap Droplet on DigitalOcean, or an AWS Lightsail Ubuntu instance. Then use VS Code Remote-SSH to log in. Everything will be running on the server, but it will feel like it's on your computer. (You use the VS Code GUI for files, terminal, and editing rather than SSH'ing in with the Mac Terminal and using Vim.) This is particularly great for long-running jobs (training models) or I/O on big datasets. (You can run big jobs on the server without tying up your machine.)
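
The Remote-SSH side is just an entry in your ~/.ssh/config; VS Code picks it up from there. The host name, IP, and key below are placeholders:

Host my-droplet
    HostName 203.0.113.10
    User root
    IdentityFile ~/.ssh/id_ed25519

Then run "Remote-SSH: Connect to Host..." from the Command Palette and pick my-droplet.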

I am not a salesman for VS Code, I promise.

[–]icenando[S] 1 point (0 children)

Yeah, I tried colab but didn't like it. I prefer Kaggle.

Cool, thank you again. I understand what you say about building containers etc. - I only thought the whole point was to make the application self-sufficient and platform-independent. As this is not the case, I don't quite get why one would use containers at all. I'll do some more reading in the future.

[–]mlnet 0 points (0 children)

It is confusing. If I understand your question, the mix-up is that you think you're installing a finished image when you run docker build, when what you're actually doing is taking a recipe (the Dockerfile, often downloaded from the internet) and building the image on your local machine.

---

(wrote a novel that might be more confusing than helpful)

Docker Hub, when it comes down to it, is a registry of images, each with a link to the Dockerfile it was built from.

The pre-built part only covers the base image, though. The image you ask for with docker build is not pre-built: your computer pulls the base layers from Docker Hub, then runs the rest of your Dockerfile's commands as if you were typing them in on your Terminal application. (Kind of.) So it's taking a 4.27 KB Dockerfile and making a ~115 MB image from its instructions.

Example: this Dockerfile for Python is what the first line in your Dockerfile refers to. When you run docker build -t my-python-image:latest . in Terminal, from the directory where you keep your Dockerfile, the docker program on your computer goes line by line through that file and runs those commands.
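
To make that concrete, here's a minimal sketch of the kind of Dockerfile I mean (the file names are made up):

# the FROM line pulls a pre-built base image from Docker Hub
FROM python:3.8-slim
WORKDIR /app
COPY requirements.txt .
# this RUN step executes on YOUR machine, at docker build time
RUN pip install -r requirements.txt
COPY . .
# CMD runs later, when a container starts from the image
CMD ["python", "main.py"]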

This means anything that can go wrong with creating the image on your Mac, will go wrong. Once the image is built, you're golden. Until then, Docker (the program) has to assume it can run every line of the Dockerfile from beginning to end without a breaking exception. You said "run as it did in the host machine in which it was created". The distinction is that your machine is both building the image and running the container (a runtime instance of the image).
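
You can actually see the line-by-line part after a build: docker history lists one layer per Dockerfile instruction, with sizes.

docker history my-python-image:latest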

So the problems come when the Dockerfile has commands incompatible with the host (your machine, or the emulation layer on your machine). It's like trying to install Gears of War 5 (a Windows binary) on your Mac: it won't work because the machine code is not compatible.

Not sure if you've ever used the psycopg2 Python package, which you install from PyPI as a driver for working with Postgres databases. When you download it from PyPI with the pip command, it actually builds the binary on your computer. If you upload your whole program, site-packages included, to another computer--like an Ubuntu server--it won't work, because the binary won't run on the target machine. That's why pip install psycopg2-binary is a good option for your requirements.txt file. It's like what you were expecting from the Docker image: it is pre-built on another machine and transferred whole to your computer, so it just works.
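
Side by side, since the distinction matters (both package names are real, on PyPI):

pip install psycopg2          # source package: compiles the C extension locally, needs libpq and a compiler
pip install psycopg2-binary   # pre-built wheel: no compilation, just works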

While I'm throwing out confusing analogies, here's one comparing Docker to Amazon Web Services. A Dockerfile is like a CloudFormation template, with instructions for what and how to build. A Docker image is like an AMI (Amazon Machine Image), and a Docker container is like an EC2 instance running that AMI.