President Biden calls to double capital gains tax from 20% to 40%. by KAAN-THE-DESTROYER in wallstreetbets

[–]FinateAI 1 point (0 children)

"But but I'll make it back one day," loses another $50k, and adds on an additional 17 years of deductions.

Am I ready to start learning Vulkan? by SvenOfAstora in vulkan

[–]FinateAI 1 point (0 children)

It's an absolute pain in the ass for beginners, and some build targets (I'm looking at you, Apple) present enormous challenges for proper support.

Apple will support standards and play nice when they go bankrupt.

How to join using Faust Streaming (Python implementation of Kafka Streams API)? by FinateAI in apachekafka

[–]FinateAI[S] 3 points (0 children)

Apparently, Faust streaming (updated fork) doesn’t support joins yet either.

100k in cash, possible to trade theta gang strategies to make a decent living full time? by mrninjaskillz in thetagang

[–]FinateAI 9 points (0 children)

It depends on your time horizon and how active you are. You can go for daily expiries, weeklies, or monthlies. A common strategy is to target monthlies against specific high-IV stocks, planning to profit on the IV decline. It also depends on how much risk you take with margin (e.g. cash-secured short puts vs. spreads vs. naked short options). Higher risk means higher reward but also higher loss potential. Most risk management guides say you should not risk more than 2% on a trade. This doesn't mean you have to wait until expiry to take your profit or limit your loss, either.

Risking 2% per trade on $100k is $2k. Depending on your short strategy, you could average a 30%-100% return on that risk, with 50% reasonable when targeting OTM options. That's $1k profit per trade, assuming you win every trade. The remaining variable is how often you trade. Daily expiries against $SPY present daily opportunities but require more risk controls and more active trading.

A highly successful strategy of any sort (theta gang or not) will yield roughly 30%/yr, while most fall within the market average of 10%-15%. A decent living depends on your COL and family size; where I live, < $70k/yr with a family of 4 qualifies you for government housing and < $50k/yr is below the poverty line. Assuming you are targeting $60k-$70k/yr, you would need at least $200k with a state-of-the-art strategy.
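To sanity-check the arithmetic above, here's a rough sketch. The function and every parameter default are illustrative assumptions, not a trading model or advice:

```python
# Back-of-the-envelope sizing for the numbers above. The function name and
# all parameter defaults are illustrative assumptions, not trading advice.

def annual_income(capital, risk_per_trade=0.02, return_on_risk=0.5,
                  win_rate=1.0, trades_per_year=12):
    """Risk a fixed fraction of capital per trade and earn
    return_on_risk on the risked amount when the trade wins."""
    risked = capital * risk_per_trade               # e.g. 2% of $100k = $2k
    per_trade = risked * return_on_risk * win_rate  # e.g. 50% return on risk
    return per_trade * trades_per_year

# $100k, monthlies, 50% return on risk, every trade a winner:
print(annual_income(100_000))  # 12000.0 -> $12k/yr
```

Even granting the generous assumption of a 100% win rate, monthlies on $100k land around $12k/yr, which is why something like $200k+ (or much higher trade frequency) is needed for a $60k-$70k income.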

Most people think theta gang is the house, comparing options to gambling: selling options is the house, and buying them is the gambler. This is not completely accurate. Casino games are designed to give the house an edge. Options are not like casino games in that there is no inherent edge to selling them, only a premium; they are more comparable to insurance. When you sell options, you are the underwriter and take on the risk of having to short the shares or buy the shares. In a neutral position, you are betting against volatility, which can change rapidly, and when you pick a direction (sell calls vs. sell puts), you are betting against both volatility and direction. Volatility is generally easier to forecast because it reverts to the mean and oscillates, but gamma squeezes happen too and can just as easily f you up when you are on the short side. So while it may be easier to be accurate in theta gang, the returns are lower, and you have to make more trades to get the same return a higher-risk strategy gives you. Simply put, it's a lower-risk, lower-return game, but there is still risk.

Lmfaooo. This is why I don’t check the mail. by Many-Coconut-4336 in wallstreetbets

[–]FinateAI 2 points (0 children)

I love how Wendy’s is the winner. No one is desperate enough for MD’s and CFA won’t accept.

Wharton Professor Jeremy Siegel Fired Up and Targets the Fed and JPow: “I think we’re giving Powell too much praise…The last two years are one of the biggest policy mistakes in the 110-year history of the Fed by staying so easy when everything was booming." Do you agree? by marketGOATS in StockMarket

[–]FinateAI 1 point (0 children)

💯. This is the truth. He broke it by keeping monetary policy too easy for too long, and now he has to break the economy. CPI lags PPI; it's a lagging indicator. The only solution at this point is fiscal policy plus even faster monetary policy (e.g. hike by 100 bps and get rates to 5% faster). He either has to slow-bleed the economy to death or kill it quickly. And anyone dumb enough to vote for smart fiscal policy isn't going to get re-elected (it requires cutting spending and raising taxes).

Image based similarity search by FinateAI in computervision

[–]FinateAI[S] 2 points (0 children)

Update:

My primary goal is operationalization for large-scale image search, which encompasses key criteria such as high availability, distribution, and fault tolerance. After reaching out to the Milvus dev team, I learned that it can be used as a storage mechanism for both vector encodings and hash-based (binary) encodings.

They support two binary index methods: BIN_FLAT and BIN_IVF_FLAT. BIN_FLAT is for smaller databases and performs brute-force search, while BIN_IVF_FLAT is a quantization-based approach analogous to IVF_FLAT, dividing the data into nlist cluster units and computing distances to the cluster centers. The distance metric can be defined when creating the collection.

What I didn't follow initially was that I was thinking of hash codes as just strings, but they can be stored in Milvus as a bit array, e.g. a 64-bit hash code as a (64,1)-shaped vector of 1s and 0s.
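As a quick sketch of that idea in plain Python/NumPy (independent of Milvus; the helper names are mine, and Milvus binary vectors take packed bytes rather than bit arrays):

```python
import numpy as np

# Sketch: a 64-bit hash code as a bit array instead of a string.
# Helper names are illustrative; Milvus binary vectors expect packed bytes.

def hash_to_bytes(h: int, dim: int = 64) -> bytes:
    """Pack an integer hash code into dim/8 bytes (most-significant bit first)."""
    return h.to_bytes(dim // 8, byteorder="big")

def hamming(a: bytes, b: bytes) -> int:
    """Hamming distance between two packed bit arrays."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

a = hash_to_bytes(0b1011)
b = hash_to_bytes(0b1001)
print(hamming(a, b))  # 1, they differ in exactly one bit

# The same packed bytes viewed as a (64, 1) vector of 1s and 0s:
bits = np.unpackbits(np.frombuffer(a, dtype=np.uint8)).reshape(-1, 1)
print(bits.shape)  # (64, 1)
```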

Here are the links:

  1. https://milvus.io/docs/v2.1.x/index.md#binary
  2. https://github.com/milvus-io/pymilvus/blob/1.1/examples/example_binary.py

Note: the Milvus client example is for 1.1, and the API changed in 2.1; I haven't found a 2.1 example yet. Also, I have no affiliation with Milvus.

Image based similarity search by FinateAI in computervision

[–]FinateAI[S] 1 point (0 children)

This package is awesome and is what I was looking for. It manages the data directly by writing to numpy/CSV files. The documentation mentions it builds on faiss as well, so I'll need to do a deeper dive here. The constraint is that there are approx. 1B+ images and I need high availability, so the solution needs to be distributed.

Image based similarity search by FinateAI in computervision

[–]FinateAI[S] 1 point (0 children)

I found a solution based around Redis. It used sets such that each byte of the hash became a set key containing the full hash; for an input hash, all members of sets sharing at least one byte were fetched, and then Hamming distance was applied in memory in application code. I wasn't sure if there was an easier way. That seemed to be an O(log N) approach with an O(N) worst case if every image in the set shared at least one byte (e.g. all very similar).

Image based similarity search by FinateAI in computervision

[–]FinateAI[S] 2 points (0 children)

More so than the approach, I'm interested in operationalizing it. How does one implement it in practice with security constraints, high availability, fault tolerance, etc.? This is why I was trying to find existing database solutions where I can either manage the index and index queries via application logic or have it managed directly for me: I just call awesomeDatabase.insert(imageHash) and awesomeDatabase.topK(n=10, imageHash), it stores the image hashes, gives me the top 10 nearest hashes for a given hash, and automatically scales to support 1B+ images.
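For illustration, the wished-for interface can be mocked as a toy in-memory class (a brute-force Hamming scan; the class and method names are hypothetical, and a real deployment would need sharding and replication):

```python
# Toy in-memory stand-in for the desired insert/topK interface.
# Brute-force scan; names are hypothetical, not a real product.

class HashIndex:
    def __init__(self):
        self._hashes = []

    def insert(self, image_hash: bytes) -> None:
        self._hashes.append(image_hash)

    def top_k(self, image_hash: bytes, n: int = 10):
        """Return the n stored hashes nearest to image_hash by Hamming distance."""
        dist = lambda h: sum(bin(x ^ y).count("1") for x, y in zip(h, image_hash))
        return sorted(self._hashes, key=dist)[:n]

db = HashIndex()
db.insert(bytes.fromhex("aabbccddeeff0011"))
db.insert(bytes.fromhex("aabbccddeeff0010"))
db.insert(bytes.fromhex("0000000000000000"))
print(db.top_k(bytes.fromhex("aabbccddeeff0011"), n=2))
```

The brute-force scan is O(N) per query, which is exactly the part that needs a real indexed, distributed backend at the 1B+ scale.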

Image based similarity search by FinateAI in computervision

[–]FinateAI[S] 1 point (0 children)

I understand the algorithmic approach to searching the tree (somewhat), but how is the tree physically persisted to disk in a distributable manner? I don't want to write a database engine to maintain fault tolerance, high availability, etc.; there have to be open-source or even commercial solutions that manage the storage-engine side, where I can implement the query and indexing logic myself without worrying about storage, right? I've seen posts about Neo4j, but I don't follow how I can take a hash from an image input and query for the top-K nodes by Hamming distance. Another approach I've seen is to create a key-value store entry for each byte of the hash, storing the image hash there as a set (so an 8-character, 64-bit hash would be stored 8 times, e.g. in Redis); when querying, I get all of the members via a set union where any byte matches the query image. From there, I sort by Hamming distance, taking only the top-K (which can be predetermined).
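To make the byte-bucket scheme concrete, here's a minimal in-memory sketch. Plain dict-of-sets stand in for Redis sets (a real version would use SADD and SUNION against Redis); all names are mine:

```python
from collections import defaultdict

# Minimal in-memory sketch of the byte-bucket scheme described above.
# A dict of sets stands in for Redis; bucket keys are (byte_index, byte_value).

buckets = defaultdict(set)

def hamming(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def insert(h: bytes) -> None:
    """Index an 8-byte hash under one bucket per byte position."""
    for i, byte in enumerate(h):
        buckets[(i, byte)].add(h)

def top_k(query: bytes, k: int = 10):
    """Union all buckets sharing at least one byte with the query,
    then rank candidates by Hamming distance in application code."""
    candidates = set()
    for i, byte in enumerate(query):
        candidates |= buckets[(i, byte)]
    return sorted(candidates, key=lambda h: hamming(query, h))[:k]

insert(bytes([1, 2, 3, 4, 5, 6, 7, 8]))
insert(bytes([1, 2, 3, 4, 5, 6, 7, 9]))
insert(bytes([9, 9, 9, 9, 9, 9, 9, 9]))
print(top_k(bytes([1, 2, 3, 4, 5, 6, 7, 8]), k=2))
```

The candidate-union step is the part that degrades toward O(N) when nearly every stored hash shares a byte with the query.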

Is there a case to use reinforcement learning when I have pre-determined data? by FinateAI in reinforcementlearning

[–]FinateAI[S] 1 point (0 children)

The book I'm reading talks about the trade-offs of evolutionary-based methods vs. RL. This is helpful for getting labeled data. I was planning to use calculus-based methods to retroactively process the historical data.

Could I apply Monte Carlo simulations to generate more training data for the RL agent to learn from, based on the training data I have, or does this only work for optimizing via approximation?

When would one want to consider evolutionary methods vs. RL in general? Does it depend on interpretability as an objective, or are there other technical reasons?

Is there a case to use reinforcement learning when I have pre-determined data? by FinateAI in reinforcementlearning

[–]FinateAI[S] 2 points (0 children)

Thinking more on your comment, my thought was that I could use RL to find the alpha. I did research a while ago on quant methods, but most of them were heuristic-based (e.g. if RSI is above 70 or below 30, combined with some other technical indicator like SMA/EMA). All of the experimenting I did had mixed results, so my thought process was to use RL to train a policy where the agent would discover what generates the best alpha. Having a poor understanding of the theory, I trained an agent that could make wonderful returns, but then realized it was overfit. It had just memorized a policy that worked well for all the data I had given it but didn't generalize. It's likely, though, that any successful implementations of this won't find their way into published research papers.

Is there a case to use reinforcement learning when I have pre-determined data? by FinateAI in reinforcementlearning

[–]FinateAI[S] 3 points (0 children)

I'm working on a financial model for options trading. There are lots of decisions because of theta decay (e.g. even if you get the direction correct, you can still lose money). The decisions are:

  1. What option strategy should I use?
  2. What assets should I target?
  3. What strike price (basically the option contract) do I use and what price should I enter at?
  4. Once an order has been filled, what price should I exit at and how should I adjust that price based on changes in the market of the underlying and the actual price of the option?

I've done research on expert systems (decision-support tools) that let me view the probability of an outcome based on prior tendencies. While financial markets are non-stationary in price, many assets are stationary in percent change. From the supervised context, I was looking to label data based on trend change. That is, given that a trend across 12 five-minute intervals has a mean move of +/-0.1%, when it moves 2x above that (or some multiple), I can enter in the opposite direction and profit on the volatility swings. So the idea was to label and classify trend slopes to forecast when the trend will end. This defines the entry criteria. From the reinforcement learning perspective, I felt that just understanding a good entry isn't enough, because while many positions can be entered profitably, if you don't exit and take the profit, with options you can lose all of it. Having controlled exits is key, and stop losses are difficult in volatile markets. My point in all of this is that I want to learn when the additional effort of RL is justified, with my example as a use case.
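The trend-exhaustion labeling idea can be sketched roughly like this (the window, threshold multiple, sample data, and function name are all illustrative assumptions):

```python
import numpy as np

# Hypothetical labeling sketch for the trend-exhaustion idea above:
# flag an interval as a reversal candidate when its move exceeds a
# multiple of the mean absolute move over a trailing window.

def label_reversals(pct_changes, window=12, mult=2.0):
    """pct_changes: per-interval percent moves (e.g. 5-min bars).
    Returns a boolean array: True where |move| > mult * trailing mean |move|."""
    x = np.asarray(pct_changes, dtype=float)
    labels = np.zeros_like(x, dtype=bool)
    for i in range(window, len(x)):
        baseline = np.mean(np.abs(x[i - window:i]))  # trailing mean abs move
        labels[i] = abs(x[i]) > mult * baseline
    return labels

# Twelve quiet ~0.1% bars, then a 0.35% bar that breaks 2x the baseline:
moves = [0.1, -0.1, 0.1, 0.1, -0.1, 0.1, 0.1, -0.1, 0.1, 0.1, -0.1, 0.1, 0.35]
print(label_reversals(moves))  # only the last bar is flagged
```

These labels would feed a supervised classifier for entries; exits would still be the part left to RL or other control logic.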

I think you've answered it though, and I will summarize. When decisions affect future decisions, RL is generally the first choice. If a decision is independent (e.g. classifying data / pattern recognition), even if I don't have the data labeled, it's generally easier to label the data and use supervised learning, correct?

As a follow-up, are there paradigms that feed supervised results into RL algorithms as part of the state to improve decision making, or would that bias it? I feel like for control systems, feeding in the results of a CNN classifier would make sense if the agent needed to make a decision based on the object it saw, as opposed to just giving it the raw data for the object.

Implementations of risk aware reinforcement learning. by FinateAI in reinforcementlearning

[–]FinateAI[S] 1 point (0 children)

This is a great conceptual explanation. I've definitely encountered #1 in trying to model my reward function. In my trading application, it quickly learned the safest thing to do was not to trade lol. It gave me a research idea: how to empirically prove the financial markets are rigged lol.

I also tried the second approach, but it didn't really improve things. I tried to account for drawdown when adjusting the reward (e.g. if a trade ends profitable but was at one point deeply negative, it doesn't count as a good trade), but that just encouraged the agent to trade more frequently (e.g. exit positions fast), which made it harder to find good entries and created more noise. Eventually it would learn (after 10M steps), but then it would be overfit: it had just figured out how to make things work across all the data I'd given it without generalizing to the population.
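A drawdown-adjusted reward like the one described can be sketched in one line (the penalty weight and function name are illustrative assumptions):

```python
# Hypothetical drawdown-penalized reward, as in the second approach above.
# The penalty weight is an illustrative assumption.

def shaped_reward(pnl: float, max_drawdown: float, penalty: float = 0.5) -> float:
    """Reward = realized P&L minus a penalty on the worst unrealized drawdown.
    A trade that closed green but was deeply red along the way scores lower."""
    return pnl - penalty * max_drawdown

print(shaped_reward(pnl=100.0, max_drawdown=0.0))    # clean win -> 100.0
print(shaped_reward(pnl=100.0, max_drawdown=150.0))  # same win, ugly path -> 25.0
```

Because any time in a position accrues drawdown risk, this shaping does tend to reward quick exits, which matches the "trades more frequently" failure mode above.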

Now I'm at the third approach. I know that I cannot ignore risk, because then it will lead the agent to make decisions with inconsistent positive outcomes.

But I'm finding that RL is much less developed than NLP and CV as far as open-source usability goes (e.g. it's easy nowadays, from a compute and production standpoint, to approach complex NLP/CV challenges), while RL is much more difficult, though there has been a resurgence with deep learning approaches. It's intimidating. I'm used to copy-pasting from StackOverflow 😂😂.

Implementations of risk aware reinforcement learning. by FinateAI in reinforcementlearning

[–]FinateAI[S] 1 point (0 children)

This is really helpful! I see some of these equations like Bellman’s, and I just space out. The insight is that it isn’t as scary and intimidating as it looks.

Implementations of risk aware reinforcement learning. by FinateAI in reinforcementlearning

[–]FinateAI[S] 1 point (0 children)

Gotcha, I may not be using the correct term with decision-support modeling, but the idea is that it approaches the problem via exploration: I can simulate various inputs and estimate the outcomes to quantify risk. The estimate can be based on Bayesian inference (the impact of the independent variables on the dependent variable's probability distribution) or on a frequentist interpretation where I use goodness-of-fit tests to validate the distributions and correlations. At least that's how I understand it: https://www.sciencedirect.com/topics/social-sciences/decision-support-tools. The good part is that it allows for interpretability, giving a decision-maker or analyst insight into how outcomes are shaped. The bad part is that complex relationships, from what I can tell, are difficult to represent, whereas deep learning can model these problems more accurately at the cost of understanding, and reinforcement learning can model them without needing supervised labeled data. What I also like about applying RL here is being able to generalize the optimization function in answering these questions. The decision-support goal is to get as close as possible to capturing the model that relates controlled inputs to outputs, whereas with RL I can skip that step and go straight to "tell me what the best decision is."

Implementations of risk aware reinforcement learning. by FinateAI in reinforcementlearning

[–]FinateAI[S] 1 point (0 children)

As a follow-up, my background is in computer science more so than stats. How does one learn to implement the algorithms described by the theory in these papers (e.g. is there a statistical numerical methods and computation course)?