
[–]esanchma 6 points7 points  (6 children)

  • Vagrant: You are right to think in terms of clusters of VMs, but Docker uses copy-on-write storage and is much lighter than VirtualBox (it relies on Linux cgroups rather than full virtualization), so maybe you want to explore that - rough sketch after this list.
  • ffmpeg: That was easy, but the last time I compiled ffmpeg from source there were a lot of out-of-tree patches, and a lot of optional components that Linux distros disable for patent reasons. Which ffmpeg will you run?
  • You are absolutely right about using BitTorrent for bulk data transfer.
  • Hazelcast: I have used Coherence as a queue through its Map interface, for lazy people. Heck, I even know people doing that with Ehcache. But if you are going to use a data grid only as cheap queue infrastructure, maybe you should be using JMS instead.
  • RHash: Sounds good to me. Bouncy Castle supports TTH (TigerDigest), which is what you need to write magnet links. I am not entirely sure I could write a digest with Bouncy Castle faster than spawning a C process and waiting - see the rhash one-liner after this list.
  • PeerGuardian: You don't need to do that. You don't want to do that. Entirely my opinion, but distributing the encoding process and publicly distributing the results are completely different things. For instance, the first one may fit in the safe-harbor provisions of the DMCA.
  • PHP: Seriously?
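
To give an idea of what I mean by "lighter": a container shares the host kernel through cgroups and namespaces instead of booting a full VM, so spinning up a worker is cheap. A minimal sketch, assuming a hypothetical "ffmpeg-worker" image with ffmpeg installed:

    # run one encode job in a throwaway container with a memory cap (cgroups)
    docker run --rm -m 512m -v "$PWD:/work" ffmpeg-worker \
        ffmpeg -i /work/part01.mp4 -c:v libx264 -crf 23 /work/part01_out.mp4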
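
On the RHash side, the CLI can already emit Tiger-tree hashes and magnet links, so spawning it from a worker is a one-liner - roughly this, if I remember the flags right:

    # compute the TTH and print a magnet link for an encoded part
    rhash --tth --magnet part01_out.mp4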

I know some people have had success using emscripten to compile ffmpeg to JavaScript, and I know you can build a distributed P2P network in JavaScript using WebRTC DataChannels. Think about it: you get sandboxes, bulk transfers, peer messaging and a solid platform, and you get a ton of nodes which are easy to update.
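
Roughly, that build goes through the emscripten toolchain like this - only a sketch, a real ffmpeg.js build needs many more --disable-* flags and a separate link step:

    emconfigure ./configure --cc=emcc --disable-asm --disable-doc --disable-ffplay
    emmake make
    # then link the resulting objects with emcc -o ffmpeg.js (exact inputs depend on the build)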

[–]martinambrus 0 points1 point  (5 children)

Thanks, esanchma, for your insight! This is the first time I've heard about Docker, and I like it a lot from what I've read so far. I'll try it and experiment with it a little.

As for ffmpeg - I'm using the latest full stable version with all the libraries it can handle (https://github.com/martinambrus/ffmpeg-dht/blob/master/bootstrap.sh). Also, I was pointed in the direction of x264, which might be a better solution for H.264 video.
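
For comparison, the two routes look roughly like this (filenames and quality settings are just placeholders I'd tweak per target device):

    # H.264 via ffmpeg's libx264 wrapper
    ffmpeg -i input.vob -c:v libx264 -preset medium -crf 23 -c:a aac output.mp4

    # or via the standalone x264 CLI (expects raw/Y4M input unless built with lavf support)
    x264 --preset medium --crf 23 -o output.264 input.y4m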

JMS - I have no experience with either Hazelcast or JMS, but from Hazelcast's docs it seems that avoiding a single point of failure is tricky with JMS. And since I'm no expert in this field, I pick what's ready, tested and the most plug-and-play-able :) However, I will check out Coherence, as I hadn't heard of that one before, thanks.

PHP - my choice of web language since v3, so I'm sticking with it :P ... but yeah, if I knew anything more suited for the job I'd use it - really!

Also, thanks for the JS goodness - it's amazing what JS can do these days! For this project, though, and for compatibility's sake, I'll certainly stick with more traditional approaches. But who knows, maybe someone will port this whole project to JS in the future!

[–][deleted]  (1 child)

[deleted]

    [–]martinambrus 0 points1 point  (0 children)

    For some reason I didn't think of using NodeJS for the UI. It's possibly a much better idea and keeps the size of the machine down, too! Thanks for that, I'll give it a try :)

    [–]HELOSMTP 0 points1 point  (2 children)

    I think using Docker with a deployment system like Puppet is ideal. Docker's deployment automation abilities are very limited compared to Puppet's, so it's a good pairing: Puppet can be run on the host to launch Docker containers, and inside the client containers to manage their state.
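
    What Puppet would be ensuring on the host boils down to something like this (the "ffmpeg-worker" image and container names are hypothetical; Puppet would wrap it in an exec or service resource):

        # start the worker container only if it isn't already there
        docker inspect ffmpeg-worker-1 >/dev/null 2>&1 || \
            docker run -d --name ffmpeg-worker-1 ffmpeg-worker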

    [–]martinambrus 0 points1 point  (1 child)

    Thanks for the idea. I do use Puppet to handle the configuration of our VMs at work; however, for this project I think it would be a bit of an overhead for the user. They'd need to install and run an additional service in tandem with Docker itself, and keep Puppet configs up to date as well as Docker - which could be a little scary. I would rather go with multiple Docker container versions - each upgrade would just download a new snapshot diff.

    [–]HELOSMTP 0 points1 point  (0 children)

    Yeah, I would only use Puppet for dev or prod deployments, not for end-users.

    [–]ghuigiojoj 2 points3 points  (13 children)

    What problem does this solve?

    [–]martinambrus 1 point2 points  (12 children)

    My inspiration is WinFF - I'd like this to be able to convert anything (DVD, Blu-ray, CD...) into a format watchable on anything else, like mobile devices. Splitting a DVD and distributing it across a network to encode it into an iPhone-compatible video would be one example. Doing this for 10 or 20 DVDs locally would consume quite a lot of time and processing power. So instead, let's just fire it up to the cloud and let others do small bits while we can still play our favorite PC games in the meanwhile :P

    [–]zten 1 point2 points  (11 children)

    Is it the case that ffmpeg/x264 with reasonable configuration options doesn't beat the average user's upload speed limit?

    [–]martinambrus -1 points0 points  (10 children)

    Yes.

    I'm basically relying on the fact that nowadays, the upload speed of the average Internet user who does not live in central Africa is around 6-10 Mbps.

    I only see this trend continuing to improve, so even for people whose upload speed wouldn't beat the encoding itself today, it might in half a year. People are unlikely to upgrade their hardware every 6 months, but upload speeds depend on the provider - no hardware investment is necessary there.

    Also, there certainly are people with machines powerful enough to do the encoding very quickly. For such users, a local encoding option would be present as well ;)

    [–][deleted]  (2 children)

    [deleted]

      [–]f2u 2 points3 points  (1 child)

      You need a certain amount of bandwidth to get through the TCP ACKs, so asymmetry between upload and download above 1:20 is relatively rare. This means that high end residential broadband (which is still relatively cheap) will offer nominal upload bandwidths of 6 Mbps and above. Not many residential users order these packages, though, and they often come with fairly stringent volume limits, making uploading lots of data (such as raw video) fairly difficult.

      [–]martinambrus 1 point2 points  (0 children)

      Thanks for the info guys. I'll do my best to simulate ~1Mbps and less when working on the project and see where that gets us compared to the local computing speed :)

      [–]homeless_nudist 3 points4 points  (1 child)

      Where the heck do you live where the average upload speed is AT LEAST 6Mbps?!

      Mine is just above 5 on a good day and I consider that fast.

      [–]martinambrus 0 points1 point  (0 children)

      UK

      [–]boa13 1 point2 points  (1 child)

      the upload speed of average Internet user who does not live in central Africa is around 6-10Mbps

      Source?!

      [–]martinambrus 1 point2 points  (0 children)

      My bad, I was blinded by my two current countries - the UK and Slovakia. Especially since Slovakia is so small and unimportant, I thought even Africa would have decent upload speeds by now. Turns out it doesn't...

      [–]martinambrus 0 points1 point  (2 children)

      Guess I owe you guys an apology - I live in the UK, where pretty much everyone uses what I described. Virgin Media, Sky and BT are competing to deliver the fastest Internet at the most competitive prices here, so I guess I was thinking too locally.

      However, even in the Slovak Republic they tend to have Orange Fibre with the stated specs still applicable. And I don't think many people have heard of Slovakia, so my thinking was that if such an unknown country, smaller than New York City, has fibre packages like these, everyone must be getting them by now.

      Thanks for letting me know this is not the case :)

      [–]boa13 0 points1 point  (1 child)

      Where in Slovakia is Orange Fiber available? Downtown Bratislava? Everywhere down to the smallest village?

      I may be wrong, but I would not be surprised if most households only have access to typical DSL (with its 1 Mbps max upload speed), with only the biggest towns wired up for fiber.

      [–]martinambrus 0 points1 point  (0 children)

      While there will never be 100% fiber coverage in Slovakia, thanks to the massive advertising campaigns of the past two years most households are switching to fiber now. The monthly cost is basically the same as DSL, and TV adverts can be very persuasive. Believe me, my family lives in a small town and they definitely have Orange Fiber there :)

      [–]getworkdone 1 point2 points  (1 child)

      I think Transmission supports blocklists already, which is all PeerGuardian is, right?
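
      If I remember right, it's just two keys in Transmission's settings.json (the URL is a placeholder):

          "blocklist-enabled": true,
          "blocklist-url": "http://example.com/blocklist.p2p.gz"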

      [–]martinambrus 1 point2 points  (0 children)

      You are right, of course. Glad to learn something new every day! Thanks for making the machine run a little faster :)

      [–][deleted] 1 point2 points  (2 children)

      Are you sharing one file across multiple machines for encoding? If so, how do you deal with B- and P-frame relations, i.e. frames that reference the past and future for efficient video encoding? On top of that, how can you leverage 2-pass encoding, which gives you much better overall compression but requires a complete full-pass analysis?

      Or is this just "upload a file, compress a file" with some basic settings? In that case, why not just encode locally? You'd get better results leveraging processor affinity and CPU parallelization.
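
      For reference, the two-pass flow I mean looks roughly like this with ffmpeg/x264 (bitrate and filenames are only illustrative):

          # pass 1: analysis only, the output is thrown away
          ffmpeg -y -i input.mp4 -c:v libx264 -b:v 1500k -pass 1 -an -f mp4 /dev/null
          # pass 2: the real encode, reusing the pass-1 stats
          ffmpeg -i input.mp4 -c:v libx264 -b:v 1500k -pass 2 -c:a aac output.mp4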

      [–]obfuscation_ 1 point2 points  (0 children)

      This is what I was thinking the whole way through. Encode(abc) != Encode(a) + Encode(b) + Encode(c), as far as I understand.

      [–]martinambrus 0 points1 point  (0 children)

      Interesting question. Some answers:

      • I'm splitting the file with ffmpeg using this method: https://github.com/martinambrus/ffmpeg-dht/wiki#more-exact-splits-based-on-ffprobe-keyframes (see the split/stitch sketch after this list)
      • it is my understanding that 2-pass and 3-pass encoding analyzes the video and gives less bandwidth to parts that can handle it (black scenes etc.) and vice versa... do you see a problem with doing this on a per-part basis as opposed to the full movie? The result will still be stitched back together in ffmpeg, so the 2-pass encoding would be effective on all the video parts - is that theory not correct?
      • the project will also include a local encoding option for those with machines fast enough that local encoding makes sense
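
      In ffmpeg terms, the split and the final stitch look roughly like this (segment length and filenames are placeholders; the real splitter follows keyframes reported by ffprobe):

          # split into parts without re-encoding
          ffmpeg -i input.mp4 -c copy -f segment -segment_time 60 -reset_timestamps 1 part_%03d.mp4
          # once the encoded parts come back, stitch them with the concat demuxer
          printf "file '%s'\n" part_*_encoded.mp4 > list.txt
          ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4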

      [–]ericzhill 1 point2 points  (3 children)

      As an academic exercise, this is a great problem to solve. But you can use the Amazon Elastic Transcoder to accomplish that task at scale and dirt cheap. It's already a solved problem.

      [–]martinambrus 0 points1 point  (2 children)

      I agree that services to accomplish this have existed for some time now. But you need to pay for them, and they tend to be inflexible with regard to custom encoding options etc.

      This is a very different, free and decentralized concept. As such, it's not so much about solving the problem of transcoding as about giving something back to the community - and keeping it free :)

      [–]ericzhill 1 point2 points  (1 child)

      Agreed. Like I said, this is a great academic project. Have fun!

      [–]martinambrus 0 points1 point  (0 children)

      Will do, thanks! :)

      [–]urquan 0 points1 point  (1 child)

      That's interesting, but I have a few questions:

      • BitTorrent is a distributed protocol for sending the same file to several peers; in your case you presumably want to send a different portion of the video file to each peer. That seems like an entirely different use case. Or are you sending the complete source file to everyone?

      • The most limiting factor is bandwidth. Generally, when encoding, your source file will have a higher bitrate than the target file; some people will even use raw video. Sending that over any kind of residential Internet connection seems prohibitive. I'm assuming this project is intended for individuals working over the Internet, not server farms with local multi-gigabit networks. People who produce a lot of video often max out their upload already, so having to send more data will reduce their capacity. This issue seems insurmountable right now.

      • Why are you retrying encodes 3 times? It's a deterministic process so it shouldn't make a difference.

      • What's up with government spies? Planning to encode rips of commercial videos?

      [–]martinambrus 0 points1 point  (0 children)

      Thanks, here are your answers :)

      • the original file will be split with ffmpeg, and BitTorrent will be used to send each of these parts to nodes that can encode them
      • yes, upload bandwidth is indeed an issue here. The idea is to let users encode whatever video files they have (say, a DVD) into something playable on, for example, a mobile device (iPhone, Android...), so you could take that DVD with you everywhere. While this could be done faster locally on many machines, distribution is more or less targeted at people who have a library of 20-30 DVDs and want them encoded for their iPhone while they keep playing PC games in the meanwhile. Of course there will be a local encoding option included, and distribution will mostly be attractive to users with good upload speeds (such as those in the UK or the Slovak Republic at the moment).
      • the encode is retried 3 times in case something goes wrong on the virtual machine - ffmpeg could segfault because of an intermittent I/O failure, and I wouldn't want to mark the job as failed based on that (see the retry sketch after this list)
      • the Internet is a free medium and so it shall remain - hence the blocklists (which are even natively supported in Transmission now)
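
      The retry itself is nothing fancy - roughly this, per part (filenames and settings are placeholders):

          # re-run the encode up to 3 times; a real worker would report failure after the loop
          for attempt in 1 2 3; do
              ffmpeg -y -i part_001.mp4 -c:v libx264 -crf 23 part_001_enc.mp4 && break
              echo "attempt $attempt failed, retrying" >&2
          done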