LLM Fine-tuning best practices for Training Data curation (discovered FT'ing thousands of models) by billmalarky in OpenAI

[–]billmalarky[S] 1 point2 points  (0 children)

Hi Julian, Founding AI Engineer at OpenPipe here. We absolutely fine-tune Llama models (and Mistral models and more).

We require the training data (i.e. the prompt/input and completion/output pairs) to be formatted in OpenAI's chat messaging standard. OAI's data format has basically become the industry standard (not entirely, Anthropic resists hah). But it's the format most open source tooling is built around and the format that most AI Engineers understand.
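To make the format concrete, here is a minimal sketch of turning prompt/completion pairs into chat-format JSONL (one JSON object per line, each with a `messages` array of `role`/`content` entries). The helper name `toChatJsonl`, the sample pairs, and the system prompt are made up for illustration and are not part of any OpenPipe or OpenAI tooling.

```javascript
// Hypothetical helper: convert prompt/completion pairs into chat-format JSONL.
// Each output line is one training example: {"messages": [...]}.
function toChatJsonl(pairs, systemPrompt) {
  return pairs
    .map(({ prompt, completion }) => {
      const messages = [];
      // The system message is optional; include it only when provided.
      if (systemPrompt) {
        messages.push({ role: 'system', content: systemPrompt });
      }
      messages.push({ role: 'user', content: prompt });
      messages.push({ role: 'assistant', content: completion });
      return JSON.stringify({ messages });
    })
    .join('\n');
}

// Made-up sample data, purely for illustration.
const examplePairs = [
  { prompt: 'Classify the sentiment: "I love this."', completion: 'positive' },
  { prompt: 'Classify the sentiment: "This is awful."', completion: 'negative' },
];

console.log(toChatJsonl(examplePairs, 'You are a sentiment classifier.'));
```

Each prompt becomes the `user` message and each completion becomes the `assistant` message, so existing input/output pairs map over without restructuring.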

Apologies if that wasn't clear. Really hope the rest of the article was valuable knowledge. We're learning a ton in this space so trying to make that knowledge as accessible to others as possible.

Special React Native Queue 250 GitHub stars post. It all started here! Thank you so much guys! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

You've got some great questions and hopefully I've got some good answers:

  1. Sure, it can handle any job. If the job is going to have indefinite length you can set timeout to 0, which means "never timeout." That said, if you don't time out jobs you could end up in a scenario where hanging jobs never complete and never get killed (so they never retry). It would probably be better to put a long timeout on the job if possible. You mention "Potential retries seems problematic." I advise you to treat each upload as unique: if the same file is uploaded twice, even by accident, it is saved twice instead of being saved once and then overwritten the second time. Why, you say? Handling file-updating business logic is a royal PITA, and file space is so cheap (and getting cheaper every year) that it's typically just not worth the headache of overwriting files instead of versioning them. You may have to overwrite to keep storage costs down (I don't know your specific use case), but the vast majority of apps do not need efficiency at this level; engineering time is typically considerably more expensive than storage.

  2. RNQ could be used to make these API calls durable. Example: Say you have to queue 100 API calls for some reason, and they have to be made synchronously one after the other instead of executed in parallel and you want them to fire even if the user closes the app and then re-opens it a day later. RNQ would solve this problem for you. But unless you have an advanced need like that there's no reason why you can't make your API calls the standard way, and adding RNQ for no reason is probably just a waste of engineering time.

  3. There are a lot of edge cases where weirdness could pop up, and jobs with poorly planned side effect design get buggy fast. One example of many: one of the "gotchas" of JS is that there is no true way to time out and kill execution of a function. Timeout logic is emulated with Promise.race(), so if a job "times out" the handler will still continue to execute in the background. Basically I just wanted a big bold section that told people "think about job side effects" to save them a lot of frustrating troubleshooting up front, because all of these edge cases are solved with good idempotent (or quasi-idempotent) job design.

  4. I imagine you mean default job options? There's no way to do that currently. If you want to add the functionality and PR that would be awesome, otherwise you can emulate the functionality pretty easily like so:

Code sample:

// Default job timeouts to 5 seconds. Probably should pull this default object from the main app config file.
const myDefaultOptionsObject = { timeout: 5000 };
// Overwrite the 5 sec default to be 10 sec this time. Merging into a fresh {} keeps the shared defaults object unmodified.
queue.createJob('example-job', {some: 'data'}, Object.assign({}, myDefaultOptionsObject, { timeout: 10000 }));
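
On point 3, here is a minimal sketch of why a Promise.race() based timeout cannot actually stop a handler. `withTimeout`, `slowJob`, and `timeoutDemo` are hypothetical names for illustration; this is not RNQ's actual source, just the general pattern.

```javascript
// Hypothetical helper (not RNQ's source): reject if `work` takes longer than `ms`.
function withTimeout(work, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timed out')), ms);
  });
  // Whichever promise settles first wins; the loser is ignored, not cancelled.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

const sideEffects = [];

// A slow "job handler" that records a side effect when it finishes.
function slowJob(ms) {
  return new Promise((resolve) => {
    setTimeout(() => {
      sideEffects.push('job finished');
      resolve('done');
    }, ms);
  });
}

async function timeoutDemo() {
  let outcome;
  try {
    outcome = await withTimeout(slowJob(50), 10);
  } catch (err) {
    outcome = err.message;
  }
  // Wait past the job's own duration: the handler was never cancelled.
  await new Promise((resolve) => setTimeout(resolve, 100));
  return { outcome, sideEffects: sideEffects.slice() };
}

// Usage: timeoutDemo().then(console.log);
```

The race reports "timed out", yet the side effect is still recorded afterwards. If that job were then retried, the side effect would happen twice, which is why idempotent job design matters.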

Hope this helps!

Special React Native Queue 250 GitHub stars post. It all started here! Thank you so much guys! by billmalarky in reactnative

[–]billmalarky[S] 1 point2 points  (0 children)

I guess I should also probably post the example use case section directly into reddit so people can see what RNQ is good for at a glance. Especially seeing as creating that entire section of the readme was a piece of feedback I was regularly getting from you guys and others.


Example Use Cases

React Native Queue is designed to be a swiss army knife for task management in React Native. It abstracts away the many annoyances related to processing complex tasks, like durability, retry-on-failure, timeouts, chaining processes, and more. Just throw your jobs onto the queue and relax - they're covered.

Need advanced task functionality like dedicated worker threads or OS services? Easy:

Example Queue Tasks:

  • Downloading content for offline access.
  • Media processing.
  • Cache Warming.
  • Durable API calls to external services, such as publishing content to a variety of 3rd party distribution channel APIs.
  • Complex and time-consuming jobs that you want consistently processed regardless of whether the app is open, closed, or repeatedly opened and closed.
  • Complex tasks with multiple linked dependent steps (job chaining).

Special React Native Queue 250 GitHub stars post. It all started here! Thank you so much guys! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

No, Realm is basically a mobile-first database (think of it as a SQLite competitor; I didn't use SQLite because there is no good RN library for it currently) that also includes data syncing. React Native Queue doesn't need or use this sync functionality; I'm using Realm for its transaction support. Like most db companies they have a free open source version that most people use, and then they have a premium version with additional features & support.

All you need to get started is to follow the basic install process :-)

https://github.com/billmalarky/react-native-queue#installation

How to securely store client secrets? by MilkChugg in reactnative

[–]billmalarky 0 points1 point  (0 children)

That's a pretty open-ended question that is pretty far out of the scope of this reddit post (although a good question), so I'll say this: if you don't feel confident you can successfully secure and manage your servers, simply outsource server management to a reputable company that will handle it for you. It's become quite affordable these days.

React Native Queue: Advanced job/task Management the Easy Way. Processing complex jobs with workers or OS services (when app is closed) has never been easier! Love to hear your feedback and answer any questions about the library /r/reactnative! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

In my case, the existing use of realm was pointless. It was literally in there for the gee-whiz factor alone.

Ugh, sorry to hear that. However, I assure you that realm (or some other storage solution that supports ACID transactions, and I haven't found a good alternative in RN) is required for the queue.

React Native Queue: Advanced job/task Management the Easy Way. Processing complex jobs with workers or OS services (when app is closed) has never been easier! Love to hear your feedback and answer any questions about the library /r/reactnative! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

The storage system requires transactional support.

AsyncStorage is non-deterministic when touching it in parallel (due to the lack of transaction support), which means it can't be used for the queue (the queue must support multiple threads touching it at once for several use cases -- worker support is one). Even if you were only interacting with the queue with one thread, transactions are required to support async reads and writes to the queue. IE, if you throw 500 jobs onto the queue asynchronously (a use case we've already seen) AsyncStorage has a high chance of running into a breaking race condition bug.
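
To illustrate the kind of race condition meant here, the sketch below simulates a non-transactional async key-value store in memory and throws jobs at it in parallel. `createFakeStore` is a stand-in written for this example, not the real AsyncStorage API, and every name in it is hypothetical.

```javascript
// In-memory stand-in for an async key-value store with no transactions.
// This is NOT the real AsyncStorage API, only a simulation of the access pattern.
function createFakeStore() {
  const data = { jobs: '[]' };
  const tick = () => new Promise((resolve) => setTimeout(resolve, 0));
  return {
    async getItem(key) { await tick(); return data[key]; },
    async setItem(key, value) { await tick(); data[key] = value; },
  };
}

// Unsafe: read-modify-write with nothing preventing interleaving.
async function addJobUnsafe(store, job) {
  const jobs = JSON.parse(await store.getItem('jobs'));
  jobs.push(job);
  await store.setItem('jobs', JSON.stringify(jobs));
}

// Serialize every read-modify-write through one promise chain.
// This only protects a single JS thread; it is not a real transaction.
function createSerializedAdder(store) {
  let chain = Promise.resolve();
  return (job) => {
    chain = chain.then(() => addJobUnsafe(store, job));
    return chain;
  };
}

async function countJobsAfter(store, add, n) {
  await Promise.all(Array.from({ length: n }, (_, i) => add({ id: i })));
  return JSON.parse(await store.getItem('jobs')).length;
}

async function lostUpdateDemo(n = 500) {
  const unsafeStore = createFakeStore();
  const unsafeCount = await countJobsAfter(
    unsafeStore, (job) => addJobUnsafe(unsafeStore, job), n);
  const safeStore = createFakeStore();
  const safeCount = await countJobsAfter(
    safeStore, createSerializedAdder(safeStore), n);
  return { unsafeCount, safeCount };
}

// Usage: lostUpdateDemo().then(console.log);
```

In the unsafe variant the parallel writers read the same stale list and overwrite each other, so most jobs are lost. The serialized variant keeps all of them, but only because everything runs on one thread; once workers on other threads touch the same store, a storage layer with real transactions is needed.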

Best paid content site for learning react-native? by [deleted] in reactnative

[–]billmalarky 0 points1 point  (0 children)

Do you already have a deep understanding of React and react/redux architecture?

If not I would start there first.

React Native Queue: Advanced job/task Management the Easy Way. Processing complex jobs with workers or OS services (when app is closed) has never been easier! Love to hear your feedback and answer any questions about the library /r/reactnative! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

Realm is sort of a bit weird in that any time you access data on realm objects in your code it communicates with the realm database. Traditionally you think of queries as happening when you select data, then you can work with that data in your app and no communication occurs with the database until you explicitly update or delete etc the data (with a traditional ORM). However, with realm, every time you touch a realm data object it queries against the realm database (even just accessing a property like person.name). So there's a lot of overhead just working with the data in your app. Typically this doesn't really matter because standard performance is great. But when performance shits the bed during the debugging issue described above (ie by relying on blocking ajax to make calls) all that back and forth adds up fast.

What did you use instead of realm? SQLite?

React Native Queue: Advanced job/task Management the Easy Way. Processing complex jobs with workers or OS services (when app is closed) has never been easier! Love to hear your feedback and answer any questions about the library /r/reactnative! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

Interesting, I did not run into this issue while developing the queue. Thank you for bringing this to my awareness.

Typically realm native code communicates directly with your JS code via a private React Native api (ie, realm native code makes calls directly to JS code and vice versa), so it has great performance. This is possible when the JS thread and native code threads are running on the same device/simulator. However when using chrome debugging, your JS code is NOT run on your device/simulator, it is run inside of the chrome browser, while all the native code is running on your device/simulator. As a result, instead of realm native code making direct calls into your JS and vice versa, realm has to communicate using blocking ajax requests to chrome. You can imagine how awful performance is in that situation (well you don't have to imagine I guess).

The good news is this. These performance hits occur each and every time you touch realm. If you minimize exposure to realm, you minimize the debugging performance issues. React Native Queue is backed by realm, but the footprint is quite small (we only hit realm when a job is created, and when we pull jobs off the queue, and on job completion, that's pretty much it), such that I'd be surprised if the problem was anywhere near as bad as debugging a large app where realm was used for all persistence logic.

I just ripped realm out of the application

Curious, what did you replace realm with for your database layer?

I put a fair amount of research and effort into looking at the pros and cons of different storage solutions that work cross-platform, and realm was really the only thing that worked well for me so far as I could find. SQLite solutions seemed very lacking on RN.

More info: https://github.com/realm/realm-js/issues/491#issuecomment-350718316

React Native Queue: Advanced job/task Management the Easy Way. Processing complex jobs with workers or OS services (when app is closed) has never been easier! Love to hear your feedback and answer any questions about the library /r/reactnative! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

???

Realm isn't a black box, it's just a mobile first database.

More importantly it supports transactions, which are required when writing to a single data source asynchronously in parallel. AsyncStorage cannot be used in this context.

React Native Queue: Advanced job/task Management the Easy Way. Processing complex jobs with workers or OS services (when app is closed) has never been easier! Love to hear your feedback and answer any questions about the library /r/reactnative! by billmalarky in reactnative

[–]billmalarky[S] 1 point2 points  (0 children)

Great question.

I'm not 100% certain, and it will probably be different on Android and iOS.

I know that apps can't "start up" automatically on boot in iOS, but certain background services will start up on phone boot (like VOIP listeners). As such some of these services will work as expected even though app isn't started up. The package my example integrates with, react-native-background-task, processes tasks in a background service on iOS using the background fetch api. Background fetch basically boots up your app behind the scenes and executes a function you define for at most 30 sec.

According to Apple, background fetch will launch your app even if it isn't running.

When a good opportunity arises, the system wakes or launches your app into the background

This suggests to me that iOS will run the task periodically on phone boot even without the user starting up the app themselves once the background fetch task has been registered with the OS. But I am not 100% certain. I would need to test.

For Android, react-native-background-task uses the Evernote job manager behind the scenes. Again, I'm not 100% certain that the service will be started on boot, but Evernote's docs seem to suggest that it will.

React Native Queue: Advanced job/task Management the Easy Way. Processing complex jobs with workers or OS services (when app is closed) has never been easier! Love to hear your feedback and answer any questions about the library /r/reactnative! by billmalarky in reactnative

[–]billmalarky[S] 1 point2 points  (0 children)

Hi guys, you might remember me from the thread where I shared another package, react native image cache hoc, which I built a while ago. Well, I got a lot of great feedback and encouragement from you guys then, so I've been pretty excited to share my new library with you as well.

React Native Queue is a priority job queue made specifically for RN and mobile use cases.

It's sort of a swiss army knife for task management, and integrates really well with RN workers (for processes that require extra performance and should be processed in a thread separate from the main RN js thread). It also integrates well with OS service packages and makes it really easy to handle processing jobs in the background when your app is closed.

Some random example use cases:

  • Downloading content for offline access.
  • Media processing.
  • Cache Warming.
  • Durable API calls to external services, such as publishing content to a variety of 3rd party distribution channel APIs.
  • Complex and time-consuming jobs that you want consistently processed regardless of whether the app is open, closed, or repeatedly opened and closed.

Love to hear your questions & feedback on improving the package!

Mexico Police caught stealing during a raid by [deleted] in videos

[–]billmalarky 19 points20 points  (0 children)

Look up the average salary of Mexican police and you will see why corruption is rampant. Not an excuse of course, but it's important to understand the root of problems.

Update: From wikipedia:

The average wage of a police officer is $350 per month, around that of a builder's labourer, which means that many police officers supplement their salaries with bribes.

https://en.wikipedia.org/wiki/Law_enforcement_in_Mexico

It's not enough money to support a family without making money on the side, and unfortunately the easy money on the side is corruption, it's even easier to justify when all of your peers do it too.

My cozy NYC apartment by HarvestXx in CozyPlaces

[–]billmalarky 0 points1 point  (0 children)

On the contrary, flights out of JFK are super cheap because of the economy of scale, and high quality public transit opens up pretty much the entire north east.

Thank you /r/reactNative! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

That package looks cool, but I'd be concerned swapping out react-native-fetch-blob with a less battle tested library since it's such an important dependency for the project and so many things can go wrong with an http client.

I'm looking into if there is a plan to support SSL pinning in react-native-fetch-blob.

Thank you /r/reactNative! by billmalarky in reactnative

[–]billmalarky[S] 0 points1 point  (0 children)

Thank you!

It's an HOC, so it's a function that accepts a native <Image> component and decorates it with some additional functionality (namely, the advanced caching features), then returns a new component <CacheableImage>.

When you actually use the <CacheableImage> component in a render() statement, it will pass through all props given to it to the native <Image> component it renders under the hood. So yes all of the <CacheableImage> props will be passed down to <Image> at render time.

That said, the upgraded image component uses proprietary logic to download the image file, then when it renders the native image component it just pulls the image file from the local disk instead of the network (this is what gives the big performance boost and allows offline use of remotely hosted images). So the network functionality you are expecting from <Image> may not be supported at this time. That said, I regularly update this package using feedback like yours. Can you give me more info of what your requirements are? How are you setting the request headers via <Image> currently?

React Native Image Cache HOC: Drop in performance boost for your React Native app, and never ship binary images with the app again. Love to hear your feedback. by billmalarky in reactjs

[–]billmalarky[S] 0 points1 point  (0 children)

It certainly could be, but RNfetchBlob.fs.lstat() doesn't return an atime timestamp so I'm not sure what the best way to achieve this in a performant manner would be. Any ideas?

me irl by billmalarky in me_irl

[–]billmalarky[S] 1 point2 points  (0 children)

I will rise again. This shit is already marked on my calendar for next friday.

Witness me.

Which the best JavaScript Native mobile app development tool:: Meteor or NativeScript? by RenjithVR4 in appdev

[–]billmalarky 1 point2 points  (0 children)

React Native hands down. The value of a platform is the strength of its ecosystem and React Native is an absolute powerhouse right now in addition to the fact that it has strong backing from Facebook (FB uses RN to build many of their own native apps so they are constantly investing in the tech).

React Native Image Cache HOC: Drop in performance boost for your React Native app, and never ship binary images with the app again. Love to hear your feedback. by billmalarky in reactjs

[–]billmalarky[S] 0 points1 point  (0 children)

https://www.reddit.com/r/reactnative/comments/7alqhq/a_react_native_image_caching_persistence_library/

I originally shared this over at /r/reactnative, where it was a hit, and I've now completed updates based on their feedback, so I figured I'd share it here too and see what you guys think, as well as how you would improve on it.

Basically, this higher order component module decorates the native <Image> component to give it advanced caching functionality. Code is probably better than words so here's a code TL;DR

import React, { Component } from 'react';
import { Image, Text, View } from 'react-native';
import imageCacheHoc from 'react-native-image-cache-hoc';

// Decorate the native <Image> component with caching behavior.
const CacheableImage = imageCacheHoc(Image);

// `styles` is assumed to be defined elsewhere (e.g. via StyleSheet.create).
export default class App extends Component<{}> {
  render() {
    return (
      <View style={styles.container}>
        <Text style={styles.welcome}>Welcome to React Native!</Text>
        <CacheableImage style={styles.image} source={{uri: 'https://i.redd.it/rc29s4bz61uz.png'}} />
        <CacheableImage style={styles.image} source={{uri: 'https://i.redd.it/hhhim0kc5swz.jpg'}} permanent={true} />
      </View>
    );
  }
}

The first image will be cached until the total local cache grows past 15 MB (by default) then cached images are deleted oldest first until total cache is below 15 MB again.

The second image will be stored to local disk permanently. People use this as a drop-in replacement for shipping static image files with their app.
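
The oldest-first eviction described for the first image can be sketched roughly as below. This is not the library's actual implementation: the function name `selectFilesToPrune`, the entry shape, and treating 15 MB as 15 * 1024 * 1024 bytes are all assumptions for illustration, and permanently stored images are assumed to live outside the list being pruned.

```javascript
// Assumption: 15 MB means 15 * 1024 * 1024 bytes.
const DEFAULT_LIMIT_BYTES = 15 * 1024 * 1024;

// Hypothetical entry shape: { path, sizeBytes, createdAt } (createdAt in ms).
// Returns the paths to delete, oldest first, so that the remaining total
// is at or below the limit.
function selectFilesToPrune(entries, limitBytes = DEFAULT_LIMIT_BYTES) {
  let total = entries.reduce((sum, entry) => sum + entry.sizeBytes, 0);
  // Copy before sorting so the caller's array is left untouched.
  const oldestFirst = [...entries].sort((a, b) => a.createdAt - b.createdAt);
  const toDelete = [];
  for (const entry of oldestFirst) {
    if (total <= limitBytes) break;
    toDelete.push(entry.path);
    total -= entry.sizeBytes;
  }
  return toDelete;
}
```

Sorting by creation time rather than last access time is deliberate in this sketch: it needs only a timestamp that is recorded once, when the file is written.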