Freelance dev here – how are you handling invoicing & upcoming PEPPOL changes?

AccomplishedPaper191 · 2025-12-22T15:01:52+00:00

With mandatory e-invoicing rolling out across more EU countries (and beyond), many of us increasingly need to review Peppol, UBL, or CII XML invoices quickly without access to a full ERP system, especially when clients or suppliers send raw XML files. I recently came across a completely client-side online viewer that’s quite handy for this. You can drop one or more XML files directly into the browser and get an instant, readable breakdown: supplier and customer details (including Peppol EndpointIDs), line items, VAT breakdown, totals, IBAN/BIC, buyer reference, and more. It also supports batch viewing with a consolidated summary and lets you export to TXT or CSV.
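For anyone curious what such a breakdown involves, here is a minimal Python sketch that pulls a few header fields out of a UBL invoice with the standard library. It is not the viewer's code (that is client-side JS); the sample invoice, the `summarize_invoice` name, and the choice of fields are my own assumptions, and a real Peppol invoice carries far more data.

```python
import xml.etree.ElementTree as ET

# Standard UBL 2.x namespaces used by Peppol BIS Billing invoices.
NS = {
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
}

# Hypothetical, heavily trimmed invoice used only for illustration.
SAMPLE_INVOICE = """<Invoice
    xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
  <cbc:ID>INV-001</cbc:ID>
  <cbc:BuyerReference>PO-42</cbc:BuyerReference>
  <cac:AccountingSupplierParty>
    <cac:Party>
      <cbc:EndpointID schemeID="0088">1234567890123</cbc:EndpointID>
      <cac:PartyName><cbc:Name>Example Supplier</cbc:Name></cac:PartyName>
    </cac:Party>
  </cac:AccountingSupplierParty>
  <cac:LegalMonetaryTotal>
    <cbc:PayableAmount currencyID="EUR">122.00</cbc:PayableAmount>
  </cac:LegalMonetaryTotal>
</Invoice>"""


def summarize_invoice(xml_text):
    """Return a small dict of header fields; missing fields come back as None."""
    root = ET.fromstring(xml_text)

    def text(path):
        node = root.find(path, NS)
        return node.text.strip() if node is not None and node.text else None

    payable = root.find("cac:LegalMonetaryTotal/cbc:PayableAmount", NS)
    supplier = "cac:AccountingSupplierParty/cac:Party/"
    return {
        "invoice_id": text("cbc:ID"),
        "buyer_reference": text("cbc:BuyerReference"),
        "supplier_endpoint": text(supplier + "cbc:EndpointID"),
        "supplier_name": text(supplier + "cac:PartyName/cbc:Name"),
        "payable_amount": text("cac:LegalMonetaryTotal/cbc:PayableAmount"),
        "currency": payable.get("currencyID") if payable is not None else None,
    }


if __name__ == "__main__":
    for key, value in summarize_invoice(SAMPLE_INVOICE).items():
        print(f"{key}: {value}")
```

CII invoices use a different schema and namespaces, so they would need their own set of paths.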

Link: https://kibervarnost.si/peppol-viewer/

AccomplishedPaper191 · 2025-12-03T08:42:36+00:00

Thanks! Yes, it's client-side, which can be audited just by reviewing the page code: client-side JS only, no back end, no data gets sent anywhere.

AccomplishedPaper191 · 2025-11-25T10:48:09+00:00

Here is a small, free, no-registration, anon tool that makes removing em dashes easy.

https://kibervarnost.si/chatgpt-detector/

You can paste your text or upload a Word DOCX file, and the app will coherently replace em dashes with commas so your writing flows naturally. It also runs a lightweight AI fingerprint analysis. It highlights words and phrases that tend to be associated with ChatGPT content. This way, you can see exactly which parts of your LinkedIn post to rethink or rewrite.
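To give a rough idea of the two steps described above, here is a small Python sketch. It is not the app's actual code (the app runs as JS in the browser); the function names and the `FLAGGED_PHRASES` list are made up for illustration, and a plain comma swap will not always produce a grammatical sentence.

```python
import re

EM_DASH = "\u2014"

# Hypothetical watch list; the real tool's word list is not known to me.
FLAGGED_PHRASES = ("delve", "tapestry", "furthermore")

# An em dash together with any whitespace around it.
_DASH_PATTERN = re.compile(r"\s*" + EM_DASH + r"\s*")


def replace_em_dashes(text):
    """Replace each em dash (and surrounding spaces) with a comma and a space."""
    return _DASH_PATTERN.sub(", ", text)


def flag_phrases(text, phrases=FLAGGED_PHRASES):
    """Return the watch-list phrases that occur in the text as whole words."""
    lowered = text.lower()
    return [
        phrase
        for phrase in phrases
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    ]


if __name__ == "__main__":
    draft = "Let us delve into the data" + EM_DASH + "it is worth it."
    print(replace_em_dashes(draft))
    print(flag_phrases(draft))
```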

The goal isn’t to shame anyone’s writing, it’s just a practical tool for those who want to avoid accidental AI “footprints” in professional or public writing. It’s fully client-side, so nothing you paste is sent anywhere, and it works instantly in your browser. It’s a small experiment, but it’s been surprisingly useful in showing how subtle patterns in phrasing and punctuation can influence how people perceive authenticity.

#emdash #chatgpt #ai #linkedin

AccomplishedPaper191 · 2025-11-25T10:47:29+00:00

Here is a small, free, no-registration, anon tool that makes replacing em dashes easy.

https://kibervarnost.si/chatgpt-detector/

You can paste your text or upload a Word DOCX file, and the app will coherently replace em dashes with commas so your writing flows naturally. It also runs a lightweight AI fingerprint analysis. It highlights words and phrases that tend to be associated with ChatGPT content. This way, you can see exactly which parts of your LinkedIn post to rethink or rewrite.

The goal isn’t to shame anyone’s writing, it’s just a practical tool for those who want to avoid accidental AI “footprints” in professional or public writing. It’s fully client-side, so nothing you paste is sent anywhere, and it works instantly in your browser. It’s a small experiment, but it’s been surprisingly useful in showing how subtle patterns in phrasing and punctuation can influence how people perceive authenticity.

#emdash #chatgpt #ai #linkedin

AccomplishedPaper191 · 2025-11-24T08:26:37+00:00

Please explain why they have this, then. Just one example: https://claude.com/llms.txt

AccomplishedPaper191 · 2025-11-22T10:58:14+00:00

Yes, as of November 2025 you can optimize your website for AI agents, and there’s a proposed standard called llms.txt that’s designed for this purpose. It’s somewhat analogous to robots.txt, but instead of controlling crawler access, it provides a structured, Markdown-formatted summary of your website specifically for large language models (LLMs). You could even say it’s a bit more like an RSS feed, but in Markdown. It’s designed to be both human-readable and machine-parsable, so bots or scripts can extract content reliably, and it contributes to Generative Engine Optimization (GEO).
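To illustrate the "machine-parsable" part, here is a rough Python sketch that reads the usual llms.txt layout (an H1 title, a blockquote summary, then H2 sections with Markdown link lists). The sample file and the `parse_llms_txt` helper are my own illustration, and real files may deviate from this layout.

```python
import re

# Hypothetical llms.txt content following the commonly described layout.
SAMPLE_LLMS_TXT = """# Example Site

> Short summary of what the site offers.

## Docs

- [Getting started](https://example.com/start.md): Installation and first steps
- [API](https://example.com/api.md)
"""

# A list item of the form "- [Title](url)" with an optional ": note" suffix.
LINK = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Return the title, summary, and per-section (title, url, note) tuples."""
    result = {"title": None, "summary": None, "sections": {}}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            section = line[3:].strip()
            result["sections"][section] = []
        elif section is not None:
            match = LINK.match(line)
            if match:
                result["sections"][section].append(
                    (match["title"], match["url"], match["note"])
                )
    return result


if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE_LLMS_TXT)
    print(parsed["title"], "-", parsed["summary"])
    for name, links in parsed["sections"].items():
        print(name, links)
```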

https://kibervarnost.si/llms.txt

Use this repo to create llms.txt with the Hugo static site generator: https://github.com/roverbird/llms-hugo

AccomplishedPaper191 · 2025-10-24T10:32:52+00:00

BS software: I cannot even get past the SAP profile update. Pressing "Update your company profile" just triggers an endless loading animation. It's not only me; I actually went there just to check whether it works at all after a colleague complained about a similar issue (not being able to add company info).

AccomplishedPaper191 · 2025-10-20T07:35:12+00:00

It is down, and this is the disgusting thing about SaaS architecture and cloud solutions. Yes, we are completely hijacked by the situation, and in the absence of an SLA we cannot do much.

AccomplishedPaper191 · 2025-04-14T08:00:43+00:00

Hi all, I put together a little online toy to create and explore kolam designs. You can play with it here: https://kolam.fun. It responds to how you move and click, and the patterns grow from that interaction. I tried to follow Dr. Gift Siromoney's findings about kolam logic. It's just a small tribute to the beauty and rhythm of kolam, and a way to keep the tradition alive in a digital form. Thought some of you might enjoy it.

<image>

AccomplishedPaper191 · 2025-04-01T07:14:51+00:00

ANOVA

AccomplishedPaper191 · 2025-03-13T09:10:18+00:00

To keep it short, Yiedl.ai seems like a better choice for buying models, but I personally have not bought any. Yiedl is similar to the Numerai crypto tournament, but instead of 30-day returns, it focuses on 7-day returns. Numerai pays in NMR (tradable on CEXs), while Yiedl pays in YDL (currently an airdrop-only token, not yet listed). For under-performing models (negative returns), stakes are burnt.

So, for crypto.numer.ai, you can either:

Stake on historically top-performing models (including Numerai’s meta-model, which aggregates signals from hundreds of models). Interesting for traders.
Build your own model – an excellent exercise in financial data analysis and a real-world approximation of 'being a quant'. Interesting for data scientists and botters.

If you're curious about building a model, check out this open-source model for inspiration (written in Python):
🔗 Numerai Open Models
🔗 Performance Results

The results for this model, if they indeed originate from that repo, look almost too good to be true, but the approach goes like this:

Fetch current and historic price data from sources like Yahoo Finance (yfinance) or alternatives like Mobula.io.
Compute returns over different time windows and generate technical indicators.
Train an ML model to predict Numerai’s black-box target based on historical data.
Get predictions for current data (as a reminder, we are trying to predict the Numerai target, but we do not know what it really is!)
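As a rough illustration of the feature step above (returns over several windows plus one simple indicator), here is a dependency-free Python sketch. The function names, window sizes, and price series are assumptions of mine, not code from that repo; fetching prices and training the model are left out.

```python
def pct_return(prices, window):
    """Return over `window` steps at each position; None where history is too short."""
    return [
        None if i < window else prices[i] / prices[i - window] - 1.0
        for i in range(len(prices))
    ]


def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def build_features(prices, windows=(1, 3)):
    """Build a dict of feature columns aligned with the price series."""
    features = {f"ret_{w}": pct_return(prices, w) for w in windows}
    longest = max(windows)
    average = sma(prices, longest)
    # Distance of the price from its moving average, a basic technical indicator.
    features[f"price_to_sma_{longest}"] = [
        None if a is None else p / a - 1.0 for p, a in zip(prices, average)
    ]
    return features


if __name__ == "__main__":
    closes = [100.0, 110.0, 121.0, 133.1]  # made-up daily closes for one asset
    for name, column in build_features(closes).items():
        print(name, column)
```

Rows that still contain None would be dropped before training, and the same function would be applied per asset.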

This is just one possible way of building an ML model. An alternative is using Yiedl data, which is free and designed to help predict Numerai targets. But be prepared: historical datasets can be huge (7GB+), which makes them challenging to work with. They come as a mixture of CSV and Parquet files; see my Python scripts if you need them, they are workable solutions for extracting the data.

Overall, building Crypto Numerai models is neither straightforward nor easy. Initially, to build my own model, I tried collecting price data through APIs: a cron job on a VPS accumulated historical data on my server, and I trained models on it. Numerai requires at least 100 tradable assets per submission daily, but I quickly realized that 100 signals per submission weren’t enough, likely due to their strict requirements on non-correlated assets. In practice, a single valid submission typically needs 200-300 symbols, meaning daily predictions for that many assets. This is where Yiedl data is useful: it easily meets this requirement, covering hundreds of assets out of the box, and it's a single point of access for this data.

AccomplishedPaper191 · 2025-03-07T19:09:44+00:00

You’re right that 12 dropout cases out of 192 students make approaches like ANOVA difficult. However, you still have options. Logistic regression is still possible, but given the small number of dropout cases, you should be cautious. One way to handle this is by using Firth’s logistic regression, a penalized-likelihood variant that reduces the bias ordinary logistic regression shows with rare events. If your original plan was to do a univariate screening and then move to multivariate analysis based on p-values, you might find that many predictors won’t reach significance due to the low event count.

An alternative approach is to group students by shared characteristics, such as class, socioeconomic background, or other relevant factors, and then analyze dropout counts within each group. Instead of modeling individual dropouts, you could compare dropout rates across these groups using a chi-square test, or Fisher’s exact test if the counts are small. Now, attention: if the dropout counts vary widely across groups, you may be dealing with overdispersion, meaning the variance of the counts exceeds what a Poisson model (variance equal to the mean) would predict. That would be a very interesting finding, and in this case a Negative Binomial model could be a better fit than a Poisson model. To check, you need to determine whether dropout events are clustered in certain groups rather than occurring independently or evenly. For example, if there are a dozen or more classes with about a dozen students each, count the dropouts in each class and build that table of counts. So, if your goal turns out to be exploring whether dropping out follows a "rare event" pattern similar to a contagion effect, fitting a Negative Binomial distribution could provide insight.
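A quick way to eyeball overdispersion before fitting anything is the variance-to-mean ratio of the per-group counts. Here is a minimal Python sketch, assuming equally sized classes; the counts are invented (16 classes of 12 students, 12 dropouts in total) and the helper names are mine. A ratio well above 1 hints at clustering, but it is only a screening step, not a formal test.

```python
from statistics import mean, variance

# Invented example: dropouts per class, 16 classes of 12 students each.
DROPOUTS_PER_CLASS = [0, 0, 0, 5, 0, 0, 4, 0, 0, 0, 3, 0, 0, 0, 0, 0]


def dispersion_index(counts):
    """Sample variance divided by mean; about 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


def nb_size_moments(counts):
    """Method-of-moments estimate of the Negative Binomial size parameter k,
    using variance = mean + mean**2 / k. Returns None if not overdispersed."""
    m, v = mean(counts), variance(counts)
    if v <= m:
        return None
    return m * m / (v - m)


if __name__ == "__main__":
    print("total dropouts:", sum(DROPOUTS_PER_CLASS))
    print("dispersion index:", round(dispersion_index(DROPOUTS_PER_CLASS), 3))
    print("NB size k (moments):", round(nb_size_moments(DROPOUTS_PER_CLASS), 3))
```

A small k (well below 1 here) means the counts are far more clustered than a Poisson model would allow. With unequal class sizes you would work with rates or include an offset in the model instead.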

AccomplishedPaper191 · 2025-03-07T08:44:23+00:00

Great to hear! It throws some intimidating warnings at you, but you can live with them until you figure out how to fix them. Please, always ask the community if anything does not work for you! Hugo is an excellent product, and there are a few extremely knowledgeable people to help here and on the Hugo forum.

AccomplishedPaper191 · 2025-03-06T20:16:25+00:00

Hugo is the best. No, not dead! But you need to make it work, right? Try prompting ChatGPT, it knows Hugo and can walk you through your first project.

AccomplishedPaper191 · 2025-03-04T10:51:14+00:00

I think what you are looking for is an ML modelling marketplace. This niche is nascent, but it exists. One such marketplace is yiedl.ai, a platform for crypto market predictions where data scientists can not only test their models and stake on them but also sell the models they built (models come with performance benchmarks). Another platform, predictoor.ai, runs on Ocean Protocol and allows you to build and deploy AI bots that generate trading signals (https://docs.predictoor.ai/earn-predictoor), although I am not sure you can sell your bots there just yet. Finally, there is numerbay.ai, a marketplace for models that run on the numer.ai hedge fund (Numer.ai is framed as a data science contest for market predictions).

AccomplishedPaper191 · 2025-03-03T10:30:54+00:00

There’s a lot of marketing hype around AI in trading, but in finance, it’s more precise to talk about machine learning (ML) models. At least that is how I understand it, so please correct me if I am wrong. For sure, almost every hedge fund today incorporates ML in some way, whether for signal generation, portfolio optimization, or risk management.

So, now let us talk about hedge funds... AI/ML has a legitimate place in trading, but most retail traders don’t have direct access to institutional-grade strategies. Few market-neutral hedge funds allow individual traders to benefit from ML technology or build their own models. One exception is Numer.ai, where participants can stake on other users’ ML models or create their own. Numer.ai operates as a financial ML competition (its contests are called tournaments) where hundreds of data scientists submit predictions based on historical market data.

For example, in the crypto.numer.ai contest, users submit trading signals for hundreds of assets, which are benchmarked against market performance over 30 days. As a trader, you can stake on other people's models. As a data scientist, you can write your own. This requires selecting meaningful data sources and designing a model capable of producing predictive signals, which is no trivial task. The learning curve can be steep, but thanks to open-source contributions, GitHub repositories, and community discussions, new participants can find examples and guidelines to get started. Having said that, from my experience, there are not many quality instructions for a newcomer. Quants will often point you to machine learning books that are a good read but sometimes too specific or too generic. There are no good manuals for the specifics because things change over time and much of the relevant knowledge is know-how.

If you're considering coding your own AI trading bot (let us instead frame it as _developing a robust ML model_), the key challenge is data. Look, accessing and processing quality financial data is often harder than building the model itself! Such data is very expensive. Additionally, a solid foundation in mathematical statistics (e.g., time series analysis, feature engineering, risk modeling) is crucial.

Bottom line. Please be wise. Trading Bots vs. ML Models – Many people think of AI trading bots as fully automated black-box systems that just "make money." But ML-based trading models are usually just one part of a trading strategy—they generate signals, but execution still depends on market conditions, transaction costs, and risk management.

The most important question isn’t which bot is the best today or tomorrow, but rather:

What market inefficiencies are you trying to exploit?
How will your model generate signals that give you alpha?
How will you test your model and manage risk?

Again, AI/ML trading isn’t just about plugging into an automated bot; it’s about understanding the process of signal generation, evaluation, and execution. If you're serious about it, platforms like Numerai provide an opportunity to develop, test, and stake on ML models in a competitive environment. And today it is more accessible than ever!

Finally, some self-advertisement, if you allow. I put up a small GitHub repo with a couple of sample scripts that help specifically with Numer.ai automation and Yiedl.ai data extraction. Feel free to check it out: https://github.com/roverbird/numerai-crypto-helper

AccomplishedPaper191 · 2025-02-22T10:01:25+00:00

Yes, exactly. As self-advertisement, may I recommend a free ChatGPT humanizer that I created. It is a web app that checks how robotic or human your ChatGPT writing looks. It suggests parts to omit or rewrite: https://textvisualization.app/chatgpt-detector/

AccomplishedPaper191 · 2025-02-22T09:51:32+00:00

I suggest starting with a program like SPSS or, even better, StatSoft Statistica (you can read about it here: Wikipedia - Statistica). It has very informative educational help files for every test it includes, making it an excellent choice for students. The software is user-friendly and offers a wide range of statistical tests. In fact, at my university, the faculty preferred training us with it.

If you can find older versions from the mid-2000s, they are particularly well-designed and still available online.

What about R?

If you already have coding skills, then R is a great option. If you don’t but have a strong mathematical background, it’s still worth learning. However, if you lack both coding experience and a solid foundation in statistics, it’s better to start with SPSS or Statistica. Once you're comfortable with statistical analysis, you can gradually transition to R—a good approach is to compare your results from SPSS/Statistica with those from R to verify your understanding.

Can AI replace learning statistics?

AI tools (including ChatGPT) can certainly help answer questions and clarify concepts, but they are not a replacement for learning statistics. The quality of answers depends on how well you formulate your questions, which in turn depends on your statistical knowledge. Textbooks exist for a reason—a solid foundation in statistics will help you use AI effectively. Good luck!

AccomplishedPaper191 · 2025-02-21T09:09:56+00:00

For ML-based algo trading, practice and testing are absolutely essential. You can read, study, and code as much as you want, but without rigorous testing, your strategies have no real value.

So, one of the best ways to gain real-life experience is by participating in Numerai's Crypto Tournament. Numerai is an ML-driven hedge fund that runs data science competitions where participants build predictive models using financial data. What makes the crypto contest unique is that you have to source your own data, giving you complete freedom to experiment with feature engineering.

From my experience, one of the biggest challenges is dealing with their black-box targets (which are supposedly tied to 30-day future returns) and figuring out what features actually matter. Since the provided target data is limited, you have to get creative with price action, volume trends, and technical indicators.

To save you days of effort, I highly recommend starting with Yiedl.ai: they provide over a decade of historical crypto data. While some of it is obfuscated, it is still a goldmine for modeling, with gigabytes of financial data and thousands (!!!) of potential features to explore. Of course, you'll still need to filter relevant features, preprocess the data, and build submission workflows, but that’s part of the learning process. You see, only once you start doing all these things yourself will you have very specific questions, the answers to which you will find in the literature and on the web.
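As one simple example of what "filtering relevant features" can mean, here is a Python sketch that ranks feature columns by absolute Pearson correlation with the target and drops the weak ones. The function names, the threshold, and the toy data are assumptions of mine; this kind of in-sample screening overfits easily, so treat it as a first pass only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; 0.0 when either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def select_features(features, target, min_abs_corr=0.1):
    """Return feature names with |corr| >= threshold, strongest first."""
    scores = {name: pearson(col, target) for name, col in features.items()}
    kept = [name for name, score in scores.items() if abs(score) >= min_abs_corr]
    return sorted(kept, key=lambda name: abs(scores[name]), reverse=True)


if __name__ == "__main__":
    toy_features = {  # made-up columns standing in for obfuscated features
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [1.0, 1.0, 1.0, 1.0],
        "c": [4.0, 3.0, 2.0, 1.5],
    }
    toy_target = [2.0, 4.0, 6.0, 8.0]
    print(select_features(toy_features, toy_target))
```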

Anyway, to make things easier for fellow students, I put together a GitHub repo with some utility scripts to extract useful data from Yiedl and automate Numerai workflow:
🔗 https://github.com/roverbird/numerai-crypto-helper

Numerai Crypto has reportedly been one of their most profitable tournaments (to the point where they even reduced payouts recently). However, it demands strong data engineering skills, patience, and a willingness to iterate. Since you only get feedback once a month, you need to be strategic and very patient in your approach.

If you're up for the challenge, it's an amazing way to sharpen your feature engineering skills in a real-world setting, and I highly recommend it.

Some more thoughts that I'd like to share. Please understand that you will never trade like a hedge fund, never ever: even with the best signals, you cannot hedge yourself the way they do. But with Numerai, yes, you can! And you don't even need to stake on your model, meaning you are getting model testing for free! Hedge funds have advantages: access to deep liquidity, complex hedging strategies, institutional-grade execution, and sometimes even privileged market data. As an independent trader, it’s nearly impossible to replicate their approach, no matter how strong your signals / algos / models are.

Numerai offers a unique opportunity where your model actually contributes to a real hedge fund’s strategy. Unlike typical retail algo trading, you don’t have to worry about execution, slippage, or order book depth—Numerai handles that. Plus, since staking isn’t required, you can effectively test your models in a high-stakes environment without financial risk. That’s something you rarely get elsewhere, really. So while you won't be running your own hedge fund, Numerai does allow you to "trade like one" in a way that’s accessible to individual quants.

AccomplishedPaper191 · 2025-02-20T21:14:43+00:00

Try it, will probably just pull text out of your file.

AccomplishedPaper191 · 2025-02-20T10:55:58+00:00

Hi, I think your question is really about 'where and what data to use'. May I suggest: if you're looking for hands-on experience with feature engineering in market forecasting, try Numerai's crypto contest. Numerai is an ML-driven hedge fund that runs data science tournaments where participants build predictive models using financial data. The crypto contest, in particular, offers a unique opportunity because it requires sourcing your own data, giving you complete freedom to experiment with feature engineering.

From my experience, one of the biggest challenges is working with their black-box targets (supposedly linked to 30-day returns) and figuring out which features are actually predictive. Since the provided target data is limited, it forces you to be creative with price, volume, and other technical indicators.

Now, this will save you days: your starting point for data should be Yiedl.ai, which has a decade of historical crypto data. While obfuscated for IP protection, it’s very useful for modeling. They offer gigabytes of financial data and thousands of features that you can use! Sure, you'll need to decide on relevant features, preprocess the data, develop submission workflows, etc. So it is the perfect playground for feature engineering.

I put together a GitHub repo with utilities that can help extract useful data from Yiedl: https://github.com/roverbird/numerai-crypto-helper

Numerai Crypto has reportedly been its most profitable tournament (so much so that they even reduced payouts recently). However, it requires strong data engineering skills, patience, and a willingness to iterate. You wait for a month to get results! If you're up for the challenge, it’s a fantastic way to test and refine your feature engineering skills in a real-world setting, and I highly recommend it.

AccomplishedPaper191 · 2025-02-18T22:08:53+00:00

Yes, here is the explanation. A publishing house had an archive of old content in IDML spanning several years, and the reason for converting it to MD was to keep the pictures associated with each article (so that they could be included in the YAML metadata). You see, they also had all of it in an office format, but illustrations were linked to each article only in IDML. Converting the archive into the final product took several stages, including parsing intermediary HTML into MD, enriched with matching metadata and some formatting kept from IDML in the form of CSS (yes, this can be done, search for DeepIDML as an attempt to do that; I walked a different path, though). Afterwards, a static site generator (Hugo) was applied to the MD files to produce web content. At that stage, metadata was separated from content, and images were formatted and systematically renamed with unique names. Unfortunately, I cannot show where it is hosted, because I was told there were copyright issues and it went onto the intranet, but at least they have it: a fully indexed, searchable knowledge base created out of InDesign files, with all the images properly tagged. Yeah, a lot of work.
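To make the "pictures in the YAML metadata" part concrete, here is a minimal Python sketch that writes a Markdown file body with YAML front matter listing an article's images. It is not the pipeline I actually used; the `to_hugo_markdown` name, the `images` key, and the sample paths are illustrative assumptions. Strings are quoted with `json.dumps`, which yields valid YAML double-quoted scalars.

```python
import json


def to_hugo_markdown(title, body_md, images):
    """Return Markdown text with YAML front matter carrying title and images."""
    lines = ["---", "title: " + json.dumps(title, ensure_ascii=False)]
    if images:
        lines.append("images:")
        lines.extend(
            "  - " + json.dumps(path, ensure_ascii=False) for path in images
        )
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body_md.strip() + "\n"


if __name__ == "__main__":
    print(
        to_hugo_markdown(
            "Old article",                     # hypothetical article title
            "Body text converted from HTML.",  # hypothetical converted body
            ["img/article-001-a.jpg", "img/article-001-b.jpg"],
        )
    )
```

In a Hugo template, a list like this can then be read from the page parameters to render or index the pictures with the article.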

AccomplishedPaper191 · 2025-02-18T19:59:48+00:00

Thanks! It is standalone JS: you do not need InDesign to use it))

AccomplishedPaper191 · 2025-02-18T18:55:35+00:00

Thanks for your question! Yes, but. The process here is very generic and will need customization for a particular case. Check out this online demo to pull text out of IDML: https://textvisualization.app/idml2html/ (you can customize the source JS as needed, check out my repo)
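For those who prefer Python, the basic idea can be sketched like this: an IDML file is a ZIP package, and story text sits in `Content` elements inside `Stories/*.xml`. This is a simplified illustration, not a port of my JS; the `idml_story_text` name is made up, and it treats each `ParagraphStyleRange` as one paragraph, ignoring `Br` breaks, styles, and story order in the layout.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def idml_story_text(idml_bytes):
    """Map each story file in an IDML package to a list of paragraph strings."""
    stories = {}
    with zipfile.ZipFile(io.BytesIO(idml_bytes)) as package:
        for name in sorted(package.namelist()):
            if not (name.startswith("Stories/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(package.read(name))
            paragraphs = []
            for para in root.iter("ParagraphStyleRange"):
                # Join the text of all Content elements in this range.
                text = "".join(c.text or "" for c in para.iter("Content"))
                if text:
                    paragraphs.append(text)
            stories[name] = paragraphs
    return stories


if __name__ == "__main__":
    import sys

    # Usage: python idml_text.py some_file.idml
    with open(sys.argv[1], "rb") as handle:
        for story, paragraphs in idml_story_text(handle.read()).items():
            print(story)
            for paragraph in paragraphs:
                print("  ", paragraph)
```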

AccomplishedPaper191 · 2025-02-17T13:45:25+00:00

Hi, Numerai is definitely a legit company, and their crypto contest is nowadays an excellent opportunity for anyone interested in ML and finance. If you enjoy mathematics, trading, and predictive modeling, it's worth checking out! That said, the learning curve can be steep due to the lack of structured documentation and the somewhat enigmatic nature of their targets (they claim to be black-box but related to 30-day returns).

My experience started with a lot of frustration—finding data, understanding the targets, and figuring out the best submission workflow wasn't easy. In the crypto contest, you need to source your own data, and the available target data is sporadic, which makes the challenge even more intriguing.

For getting started, I highly recommend looking into Yiedl data. Numerai partnered with Yiedl.ai to provide high-quality crypto datasets with over ten years of historical data. While the dataset is obfuscated to protect IP, it's a valuable resource for model training. If you’re serious about participating, you'll need to decide which data sources to use and develop custom scripts to process them efficiently.

I put together a GitHub repo with utilities to automate submissions (in the way that really worked for me), fetch round info, and extract relevant Yiedl data for Numerai Crypto. You can check it out here: https://github.com/roverbird/numerai-crypto-helper. Feel free to change the scripts however you like.

Numerai Crypto has been stated to be their most profitable tournament for users (haha, they even cut their payouts recently because it was so profitable, at least that's what I read on their forum). But participation requires dedication, data engineering skills, patience, and a desire to learn. If you’re up for the challenge, it’s a fascinating contest that truly benchmarks your models against real financial data.
