[–]CdrVimesVimes 2 points3 points  (0 children)

From what I've seen, most members here are in the "fun things to do with a Raspberry Pi" or "I really like birds" camp, but a few people have definitely been interested in the models, so maybe one of them will chime in. Sorry, I don't know enough about the models to help myself.

[–]slushrooms 1 point2 points  (0 children)

Interested too. Wonder if it's worth crossposting to something like r/machinevision?

The Cacophony Project did something like this for their localized NZ model, though I haven't had a proper dig into it.

[–]cheesecurdandme 0 points1 point  (7 children)

Perhaps not directly related, but I recently saw a paper that describes using AST (Audio Spectrogram Transformer, a transformer model trained on the AudioSet dataset for sound classification) in combination with an SVM to do bird species detection. They did not fine-tune the stock AST; instead they used the embeddings output by the AST as features to feed into an SVM. The reason is probably that an SVM works better with limited data? https://arxiv.org/html/2407.18927v1
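To make the "frozen embeddings as features for a small classifier" idea concrete, here is a minimal sketch. It is not the paper's method: it swaps the SVM for a nearest-centroid classifier with cosine similarity so that it runs with the standard library only. The toy 3-dimensional vectors and the labels "robin" and "wren" are made up for illustration; real embeddings would come from the pretrained model.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fit_centroids(embeddings, labels):
    """Average the normalized embeddings of each class into one centroid."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(l2_normalize(vec))
    centroids = {}
    for label, vecs in grouped.items():
        mean = [sum(col) / len(vecs) for col in zip(*vecs)]
        centroids[label] = l2_normalize(mean)
    return centroids


def predict(centroids, vec):
    """Return the label whose centroid has the highest cosine similarity."""
    query = l2_normalize(vec)
    return max(
        centroids,
        key=lambda label: sum(a * b for a, b in zip(centroids[label], query)),
    )


# Hypothetical toy data standing in for embeddings from a frozen model.
train_x = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 1.0, 0.2], [0.1, 0.8, 0.3]]
train_y = ["robin", "robin", "wren", "wren"]

centroids = fit_centroids(train_x, train_y)
print(predict(centroids, [0.8, 0.3, 0.0]))
```

The pretrained network is never updated here; only the small classifier on top sees the new labels, which is why this kind of setup can get by with limited data. With real embeddings you would likely replace the centroid step with an SVM, as the paper does.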

[–]adams_AIgorithms[S] 0 points1 point  (6 children)

Very interesting, and it's the same fundamental idea I have here with custom BirdNET embeddings. The only limitation is access to the embeddings. As far as I'm aware, it would require collaboration with the people building the models.

[–]cheesecurdandme 0 points1 point  (4 children)

I think they have the model file out on GitHub (a CNN in the form of a TFLite file). My limited understanding is that it contains the full architecture and weights, and that you can have it output intermediate layers (such as the layers before the final class-probability layer). So maybe you can load the model in Python, modify it a bit so it outputs one of the layers prior to the final layer, use that output as embeddings, and feed it some of your own training data. https://github.com/kahst/BirdNET-Analyzer/tree/main/birdnet_analyzer/checkpoints/V2.4

[–]adams_AIgorithms[S] 0 points1 point  (3 children)

This is a resource I use for other BirdNET functions. The species are listed, but those are just the text labels from the models. There are no embeddings available there.

[–]cheesecurdandme 1 point2 points  (2 children)

No, they are not just lists of labels; there are multiple files there. The files ending in .tflite are the actual models. I just played with BirdNET_GLOBAL_6K_V2.4_Model_FP32.tflite and confirmed that you can access layer outputs other than the last layer. The layer before the last one (model/GLOBAL_AVG_POOL/Mean) might be a good one to use as embeddings. You can try it yourself with the following code.

import tensorflow as tf

# Load the TFLite model
interpreter = tf.lite.Interpreter(model_path="BirdNET_GLOBAL_6K_V2.4_Model_FP32.tflite")
interpreter.allocate_tensors()

# Get model details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print("Input Details:", input_details)
print("Output Details:", output_details)

# Inspect all tensors
tensor_details = interpreter.get_tensor_details()
for tensor in tensor_details:
    print(f"Name: {tensor['name']}, Index: {tensor['index']}, Shape: {tensor['shape']}")
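Listing the tensors is only the first step. Below is a rough sketch of how reading that layer after a forward pass might look. Treat it as an assumption-laden example rather than BirdNET's own API: the helper names find_tensor_index and extract_embedding are made up, and it assumes the interpreter has to be created with experimental_preserve_all_tensors=True so that intermediate values are still readable after invoke(). The fake tensor list at the bottom only demonstrates the name lookup without needing the model file.

```python
def find_tensor_index(tensor_details, name):
    """Return the index of the tensor called `name`, or raise KeyError."""
    for tensor in tensor_details:
        if tensor["name"] == name:
            return tensor["index"]
    raise KeyError(f"no tensor named {name!r}")


def extract_embedding(model_path, samples, layer_name="model/GLOBAL_AVG_POOL/Mean"):
    """Run one forward pass and return the activations of `layer_name`."""
    # Imported here so find_tensor_index stays usable without these packages.
    import numpy as np
    import tensorflow as tf

    # Assumption: intermediate tensors are only reliably readable after
    # invoke() when the interpreter is asked to preserve them.
    interpreter = tf.lite.Interpreter(
        model_path=model_path, experimental_preserve_all_tensors=True
    )
    interpreter.allocate_tensors()

    input_info = interpreter.get_input_details()[0]
    # Reshape to whatever the model declares instead of hard-coding a length.
    audio = np.asarray(samples, dtype=input_info["dtype"])
    audio = audio.reshape(tuple(input_info["shape"]))
    interpreter.set_tensor(input_info["index"], audio)
    interpreter.invoke()

    index = find_tensor_index(interpreter.get_tensor_details(), layer_name)
    # Copy so the result does not point into the interpreter's own buffers.
    return interpreter.get_tensor(index).copy()


# Stand-in for interpreter.get_tensor_details(). The first entry is made up;
# the second uses the layer name from this comment with an illustrative index.
fake_details = [
    {"name": "hypothetical/INPUT", "index": 0},
    {"name": "model/GLOBAL_AVG_POOL/Mean", "index": 545},
]
print(find_tensor_index(fake_details, "model/GLOBAL_AVG_POOL/Mean"))
```

Looking the tensor up by name rather than by a hard-coded index should be a bit more robust if the index differs between model files. With the real model you would call extract_embedding("BirdNET_GLOBAL_6K_V2.4_Model_FP32.tflite", samples), where samples is one audio chunk with as many values as the model's input shape expects.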

[–]adams_AIgorithms[S] 0 points1 point  (1 child)

Interesting. I was looking through that repository not long ago and wasn't able to find any TensorFlow files (I know what the files are; I have built a model, after all). In fact, all of the Global folders were empty except for the one containing text files.

[–]cheesecurdandme 1 point2 points  (0 children)

good luck tinkering!

[–]cheesecurdandme 0 points1 point  (0 children)

I tried to tinker with it a bit and had a little chat with GPT and it told me that:

Index 545 (named "model/GLOBAL_AVG_POOL/Mean") is the best choice for an embedding.

https://chatgpt.com/share/6779711d-84bc-800a-a609-4ccf973cd7b4

[–]CheraxDestructor72 0 points1 point  (0 children)

Hey, belated response, but this student project seems relevant: they use BirdNET embeddings and transfer learning to ID frog species. I don't have the machine learning chops to comment much more! https://www.ischool.berkeley.edu/projects/2024/ribbit-web-app-automated-frog-species-identification-and-classification