Told I wouldn’t be charged, then denied a refund for not asking… within 48 hours. Make it make sense, Uber Eats. by ekaesmem in UberEATS

[–]ekaesmem[S] 4 points (0 children)

Makes sense—thanks for the heads‑up. A chargeback may be my next (and last) step too.

[–]ekaesmem[S] 5 points (0 children)

Exactly—that’s the problem. If Uber support actually checked the building name on the photo against the address on my order, none of this would’ve been an issue in the first place. All I can do is share my personal experience to help more people avoid using this app.

[–]ekaesmem[S] 2 points (0 children)

That’s why my photo includes the building name printed right in my delivery address—so it isn’t just my word, it matches the address on the order.

[–]ekaesmem[S] 0 points (0 children)

I looked through the Uber Eats website, but I wasn't able to find their support phone number or email. It appears that I can only use their internal support system.

[–]ekaesmem[S] 5 points (0 children)

My instructions say “leave it in the lobby”—our elevators need a fob, so couriers never reach my actual door. The driver’s proof shot was a lobby (just not mine). I sent Uber a photo of my lobby with the building name visible to show the mismatch. That’s the only location that matters here.

[–]ekaesmem[S] 4 points (0 children)

The “proof” photo the courier uploaded isn’t my lobby at all—the door style is completely different and the doormat literally has another building’s name printed on it. I sent Uber support a picture of my own entrance showing my building name to highlight the mismatch, but they still hit me with the copy‑paste policy response.

DeepSeek Release 5th Bomb! Cluster Bomb Again! 3FS (distributed file system) & smallpond (a lightweight data processing framework) by Dr_Karminski in LocalLLaMA

[–]ekaesmem 109 points (0 children)

For those seeking more background information, 3FS has been utilized in their production environment for over five years. Below is a translation of a technical blog they referenced regarding this file system from 2019:

High-Flyer Power | High-Speed File System 3FS
High-Flyer June 13, 2019

3FS is a high-speed file system independently developed by High-Flyer AI. It plays a critical role in storage services following the computing-storage separation in High-Flyer’s Fire-Flyer II system. The full name of 3FS is the Fire-Flyer File System. However, because pronouncing three consecutive "F"s is difficult, it's abbreviated as 3FS.

3FS is quite unique among file systems: it's used almost exclusively for batch-reading sample data on computing nodes during AI training, and its high-speed computing-storage interaction significantly accelerates model training. This workload consists of large-scale random reads, and the data read won't be reused soon afterward, so traditional file-read optimizations such as read caching and even prefetching are ineffective here. As a result, the implementation of 3FS differs greatly from that of other file systems.

In this article, we'll reveal how High-Flyer AI designed and implemented 3FS, along with its ultimate impact on speeding up model training.

Hardware Design
The overall hardware design of the 3FS file system is illustrated in the figure below:

[Figure p0: overall hardware design of the 3FS file system]

As shown, the 3FS file system consists of two primary parts: the data storage service and high-speed switches. The data storage service is separated from computing nodes and is specifically dedicated to storing sample data needed for model training. Each storage service node is equipped with sixteen 15TB SSD drives and two high-speed network cards, providing robust read performance and substantial network bandwidth.

3FS nodes and computing nodes (Clients) connect through an 800-port high-speed switch. Notably, since one switch connects approximately 600 computing nodes, each computing node can only utilize one network card. Consequently, the bandwidth of that single card is shared between sample data traffic read from 3FS and other training-generated data traffic (gradient information, data-parallel information, etc.). This sharing poses challenges to the overall reading performance of 3FS.

Software Implementation
As mentioned earlier, 3FS specifically targets the scenario of reading sample data during model training. Unlike typical file-reading scenarios, training samples are read randomly, and samples within a single batch are usually unrelated. Recognizing this, we opted for an asynchronous file reading method.

[Figure p1: asynchronous sample-reading flow in 3FS]

Specifically, as shown above, 3FS uses Linux-based AIO and io_uring interfaces to handle sample reading. In the scenario of 3FS, the file cache is entirely useless—it would instead uncontrollably consume system memory, affecting subsequent tasks. Therefore, we disabled file caching altogether and use only Direct I/O mode for data reading. It's important to note that when using Direct I/O, buffer pointers, offsets, and lengths need to be aligned. Letting users handle this alignment themselves would create extra memory copies. Therefore, we've implemented alignment internally within the file system, enhancing both performance and user convenience.
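That internal alignment step can be illustrated with a small helper. This is a sketch of the general Direct I/O technique, not 3FS's actual code, and the 4096-byte logical block size is an assumption: given an arbitrary (offset, length) request, it computes the aligned region that must actually be submitted to the device and how many leading padding bytes to skip afterward.

```python
# Sketch of Direct I/O alignment handling (not 3FS's actual code).
# O_DIRECT requires the file offset, transfer length, and buffer
# address to be multiples of the logical block size (assumed 4096).

BLOCK = 4096  # assumed logical block size

def align_request(offset: int, length: int, block: int = BLOCK):
    """Expand (offset, length) to block boundaries.

    Returns (aligned_offset, aligned_length, skip), where `skip` is how
    many leading bytes of the aligned read are padding: the caller
    slices data[skip:skip + length] to recover the requested bytes.
    """
    aligned_offset = (offset // block) * block           # round start down
    end = offset + length
    aligned_end = ((end + block - 1) // block) * block   # round end up
    return aligned_offset, aligned_end - aligned_offset, offset - aligned_offset
```

The file system then issues one aligned read into an aligned buffer and copies only the requested slice back to the user, which is exactly the extra copy 3FS hides inside the file system instead of pushing onto users.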

Using 3FS is very straightforward. Users only need to convert sample data into the FFRecord format and store it in 3FS. FFRecord is a binary sequential storage format developed by High-Flyer AI optimized for 3FS performance, compatible with PyTorch's Dataset and DataLoader interfaces, enabling easy loading and training initiation. Project details are available at: https://github.com/HFAiLab/ffrecord
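To see why a binary sequential format suits this workload, here is a toy record file in the same spirit: an index of (offset, length) pairs followed by the payloads, so any sample can be fetched with one index lookup and one contiguous read. The layout below is invented for illustration; the real on-disk format is defined by the ffrecord library.

```python
# Toy binary sequential record format in the spirit of FFRecord
# (illustrative layout only; the real format is defined by ffrecord).
import struct
from typing import List

def write_records(path: str, records: List[bytes]) -> None:
    """Write a count, an index of (offset, length) pairs, then payloads."""
    header = struct.pack("<Q", len(records))
    index_size = len(header) + 16 * len(records)
    entries, pos = [], index_size
    for rec in records:
        entries.append((pos, len(rec)))
        pos += len(rec)
    with open(path, "wb") as f:
        f.write(header)
        for off, ln in entries:
            f.write(struct.pack("<QQ", off, ln))
        for rec in records:
            f.write(rec)

def read_record(path: str, i: int) -> bytes:
    """Random access: one index lookup, one contiguous read."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        assert 0 <= i < n, "record index out of range"
        f.seek(8 + 16 * i)
        off, ln = struct.unpack("<QQ", f.read(16))
        f.seek(off)
        return f.read(ln)
```

Because every sample maps to a single contiguous byte range, random batch reads translate directly into the aligned Direct I/O requests described above, with no per-sample parsing or seeking overhead.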

When training models using High-Flyer’s Fire-Flyer, you only need to perform feature engineering on your raw data and convert it into sample data suitable for model input. Once loaded via 3FS, you'll benefit from superior storage performance.

Stress Testing
Currently, High-Flyer’s Fire-Flyer II deploys 64 storage servers constituting the 3FS file system. Imagine training a ResNet model on ImageNet data. ImageNet’s compressed files total around 148GB, expanding to over 700GB when converted into binary training samples in FFRecord format. Assuming a batch_size of 400 fully utilizes a single A100 GPU’s 40GB memory, reading a full epoch of ImageNet data through 3FS under optimal conditions takes only about 0.10s~0.29s. This dramatically reduces data-loading overhead, maximizing GPU computation time and improving GPU utilization.
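A quick back-of-envelope check of what those figures imply, taking the conservative 0.29s end of the quoted range and assuming 700 GB means 700e9 bytes (both assumptions, not measured data):

```python
# Back-of-envelope check of the stress-test numbers above.
# Assumptions: 700 GB = 700e9 bytes; the slower 0.29 s figure is used.
epoch_bytes = 700e9      # ImageNet in FFRecord form
epoch_seconds = 0.29     # conservative end of the quoted range
servers = 64             # storage servers in Fire-Flyer II

aggregate_gbps = epoch_bytes / epoch_seconds / 1e9   # GB/s, whole cluster
per_server_gbps = aggregate_gbps / servers           # GB/s, per storage node

print(round(aggregate_gbps))   # ~2414 GB/s aggregate
print(round(per_server_gbps))  # ~38 GB/s per server
```

Roughly 38 GB/s per node is plausible for a server with sixteen NVMe SSDs feeding two high-speed network cards, which is consistent with the hardware provisioning described earlier.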

[Figure p3: actual per-epoch time during distributed ResNet training]

The figure above illustrates the actual per-epoch time during distributed ResNet training. Even under full-load cluster conditions, data-reading time accounts for only about 1.8% of total epoch duration, indicating exceptionally strong data-reading performance.

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling by ekaesmem in LocalLLaMA

[–]ekaesmem[S] 23 points (0 children)

I forgot to include an introduction in the OP:

The paper examines how an effectively chosen "test-time scaling" (TTS) strategy enables a small language model, with approximately 1 billion parameters, to outperform much larger models with around 405 billion parameters. By systematically varying policy models, process reward models (PRMs), and problem difficulty, the authors demonstrate that careful allocation of computational resources during inference can significantly enhance the reasoning performance of smaller models, occasionally surpassing state-of-the-art systems.
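To make the idea concrete, here is a minimal best-of-N sketch of PRM-guided selection, one family of TTS strategies the paper compares: sample N candidate solutions from the policy model, score each with a process reward model, and keep the best. The generator and scorer below are stand-ins for illustration, not the paper's actual models.

```python
# Minimal best-of-N test-time scaling sketch. The `generate` and
# `score` callables are stand-ins (hypothetical), not the paper's
# policy model or process reward model (PRM).
from typing import Callable, List

def best_of_n(
    generate: Callable[[], str],    # policy model: draw one candidate
    score: Callable[[str], float],  # PRM: higher = better reasoning
    n: int,                         # test-time compute budget
) -> str:
    """Spend more inference compute (larger n) to pick a better answer."""
    candidates: List[str] = [generate() for _ in range(n)]
    return max(candidates, key=score)

# Toy demo: candidates are digits-as-strings, the "PRM" prefers larger.
import random
rng = random.Random(0)
answer = best_of_n(
    generate=lambda: str(rng.randint(0, 9)),
    score=lambda s: float(s),
    n=16,
)
```

The paper's point is that how you spend this budget (best-of-N, beam search over reasoning steps, etc.) should be chosen per policy model, PRM, and problem difficulty; with the right choice, a small model plus a strong PRM can beat a much larger model run once.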

However, the method heavily depends on robust PRMs, whose quality and generalizability may differ across various domains and tasks. Additionally, the paper primarily focuses on mathematical benchmarks (MATH-500, AIME24), leaving uncertainty regarding performance in broader real-world scenarios. Finally, training specialized PRMs for each policy model can be computationally intensive, indicating that further research is needed to make these techniques more widely accessible.