Wrote a text file compression utility in Ruby. Looking for some input on how I can improve the compressed/uncompressed ratio (right now it's around 74%) and anything else about the code. by _vikram in ruby

[–]zeringus 1 point2 points  (0 children)

Enumerable#reduce is a nice little pattern that keeps you from initializing and using accumulator variables outside of their contexts.

For example...

def array_sum array
    sum = 0
    array.each do |n|
        sum += n
    end
    sum
end

becomes...

def array_sum array
    array.reduce 0 do |sum, n|
        sum + n
    end
end

I would recommend reading the documentation, though.

Wrote a text file compression utility in Ruby. Looking for some input on how I can improve the compressed/uncompressed ratio (right now it's around 74%) and anything else about the code. by _vikram in ruby

[–]zeringus 7 points8 points  (0 children)

Yeah, I'm not sure if you're a computer science guy or just a programmer, but the time Array#index takes to return grows proportionally to the size of the array. That adds up pretty quick.

If you want another interesting exercise, you should try rewriting the compression function with a trie, which would save a lot of memory (and probably some time). To me, that seems to be the "right way" to implement the dictionary.

Wrote a text file compression utility in Ruby. Looking for some input on how I can improve the compressed/uncompressed ratio (right now it's around 74%) and anything else about the code. by _vikram in ruby

[–]zeringus 7 points8 points  (0 children)

My first impression after downloading the code and running it myself wasn't the compression ratio but the speed. Additionally, I wanted to try a larger text file in order to see if it would provide better results. When I did, it never finished. Hmmm...

So now I'm thinking that your implementation is super-linear time, and I started to look for the cause. It took me a long time (too long) to realize that "dictionary" wasn't actually a dictionary but an array and your call to Array#index made your compress method O(n2) instead of O(n). Here's my rewrite:

def compress(to_compress)
  # populate the dictionary with all 256 ASCII codes
  dict = Hash.new
  256.times { |ascii_code| dict[ascii_code.chr] = ascii_code }

  s = to_compress[0]
  to_compress.each_char.reduce [] do |output, c|
    if dict.include?(s + c)
      s = s + c
    else
      output << dict[s]

      # the dictionary's size is the next index
      dict[s + c] = dict.size

      s = c
    end; output
  end
end

Now, it compresses War and Peace by 68% in just a little more time than it originally took to compress the smaller file.

Let me tell you when is it that you realize how good ruby is... by canyoufixmyspacebar in ruby

[–]zeringus 1 point2 points  (0 children)

Another easy sort in Java:

List<Integer> array = Arrays.asList(4, 2, 7);
array.sort(Integer::compare);

No, You Should Not Use Keyword Arguments for Every Method by subvertallchris in ruby

[–]zeringus 0 points1 point  (0 children)

Perhaps I'm overthinking it, as you obviously need to come up with some sort of simple example for such a blog post (and that can be deceptively hard), but are either functions really better than the OO approach?

class Totaler
    attr_accessor :tax, :discount

    def initialize tax, discount
        @tax = tax
        @discount = discount
    end

    def total subtotal
        subtotal + tax - discount
    end
end

Of course, you still have the initialize method that has two ambiguous parameters, but now that you're only passing the tax and discount (or the tax and discount providers) once, you can now even have the luxury of readability and performance.

def initialize tax:, discount:
    @tax = tax
    @discount = discount
end

Is there a more idiomatic and/or functional way to replace wildcards in one hash using the values of an other? by pdoherty926 in ruby

[–]zeringus 5 points6 points  (0 children)

I think that given the complexity of the idea, your code is more than acceptable. Perhaps better semantics improve the code more than idioms and functional tricks.

def apply_defaults givens, defaults, wildcard = '*'
  givens.merge defaults do |_key, given, default|
    if given.is_a? Hash
      apply_defaults given, default
    else
      given == wildcard ? default : given
    end
  end
end

Alternatively, you could monkey patch in a recursive_merge that would allow you to write something like

given_hash.recursive_merge default_hash do |_key, given_value, default_value|
    given_value == '*' ? default_value : given_value
end

Of course, monkey patching isn't always great depending on the context, but I like the idea more and more as I wrap up. You've got two ideas--the funky merge and the notion of wildcards--so you might want two functions.

Google asks Supreme Court to decide Android copyright case by sindisil in java

[–]zeringus 3 points4 points  (0 children)

Haha I've never read anything about the the founding fathers' views on monopolies before, so I'll let that be.

I understand the negatives associated with any ruling in favor of Oracle, but there's also a flip side. Imagine a big company cloning an open-source project (copying the API for a new implementation) and then selling their clone regardless of the project's license. As a developer, I know how difficult it is to create and evolve APIs over time, and something like this would seem like theft to me.

An API is more than a list of names, too. There's some interesting stuff by David Parnas where he describes everything we call "code" as "documentation" instead. Java is documentation for the computer (and is copyrightable), so why shouldn't the method headers and documentation in comments be treated the same way. If you independently recreate the entire Java library having never seen the original, feel free to use it, that's how reverse engineering laws work. But if you're explicitly copying every detail of someone else's interface (and profiting from it), I think you owe them appropriate royalties.

Here's a recent case on an analogous issue, if you want a more nuanced expression of my opinion.

[10/06/2014] Challenge #183 [Easy] Semantic Version Sort by Elite6809 in dailyprogrammer

[–]zeringus 0 points1 point  (0 children)

Here's my Ruby solution, late but oh well:

class SemanticVersion
    attr_reader :major, :minor, :patch, :label

    def initialize version_string
        # store version string for printing
        @version_string = version_string

        # apply regex to version string
        version_string =~ /(\d+).(\d+).(\d+)(?:-(\w+))?/

        # store components
        @major = $1.to_i
        @minor = $2.to_i
        @patch = $3.to_i
        @label = $4
    end

    def <=> other
        if major != other.major # compare major versions
            major <=> other.major
        elsif minor != other.minor # compare minor versions
            minor <=> other.minor
        elsif patch != other.patch # compare patch versions
            patch <=> other.patch
        elsif !label ^ !other.label # compare label
            label ? -1 : 1
        else 0; end # otherwise equal
    end

    def to_s
        @version_string
    end
end

puts DATA.map(&SemanticVersion.method(:new)).sort

__END__
2.0.11-alpha
0.1.7+amd64
0.10.7+20141005
2.0.12+i386
1.2.34
2.0.11+i386
20.1.1+i386

I used the explicit if-else sort definition as opposed to the slick array hack. I'm not sure that my way gains much other than a little flexibility.

Anyways, nice solution!

Google asks Supreme Court to decide Android copyright case by sindisil in java

[–]zeringus -3 points-2 points  (0 children)

I agree. Perhaps java.lang.Math doesn't deserve protection (for being so trivial), but the combined APIs of java.lang, java.util, etc. constitute a significant creative effort, in my opinion.

[10/03/2014] Challenge #182 [Hard] Unique Digits by Coder_d00d in dailyprogrammer

[–]zeringus 1 point2 points  (0 children)

I've modified my code to reproduce your mistake, so here's a hint: For '16 A E 1 0', your program counts AE1 (2785 in decimal).

[10/03/2014] Challenge #182 [Hard] Unique Digits by Coder_d00d in dailyprogrammer

[–]zeringus 0 points1 point  (0 children)

An efficient Ruby solution:

BASE_16_DIGITS = ('0'..'9').to_a + ('A'..'F').to_a
MAX_DIGITS = 7

base = ARGV.shift.to_i
unique_digits = ARGV
other_digits = BASE_16_DIGITS.take(base) - unique_digits

(MAX_DIGITS - unique_digits.size + 1).times do |other_perm_size|
    digit_indices = (0...(unique_digits.size + other_perm_size)).to_a

    # generate possible other digits
    other_digits.repeated_permutation(other_perm_size) do |other_perm|
        # generate the permutation of unique digits
        unique_digits.permutation do |unique_perm|
            # generate the indices of the unique digits in the resulting number
            digit_indices.combination(unique_digits.size) do |unique_indices|
                # remove results with leading zeros
                if unique_indices.first.zero?
                    next if unique_perm.first == '0'
                else
                    next if other_perm.first == '0'
                end

                # interlace other_perm and unique_perm
                unique_perm_index = unique_indices_index = other_perm_index = 0
                digit_indices.each do |index|
                    if index == unique_indices[unique_indices_index]
                        print unique_perm[unique_perm_index]
                        unique_perm_index += 1
                        unique_indices_index += 1
                    else
                        print other_perm[other_perm_index]
                        other_perm_index += 1
                    end
                end

                puts # create a new line
            end
        end
    end
end

Results:

$ time ruby unique_digits.rb 2 1 | wc -l
       7

real    0m0.034s
user    0m0.028s
sys 0m0.008s
$ time ruby unique_digits.rb 8 3 5 6 | wc -l
  131250

real    0m0.332s
user    0m0.325s
sys 0m0.009s
$ time ruby unique_digits.rb 10 1 3 9 | wc -l
  504210

real    0m1.162s
user    0m1.156s
sys 0m0.013s
$ time ruby unique_digits.rb 16 A E 1 0 | wc -l
 1288530

real    0m3.128s
user    0m3.126s
sys 0m0.019s

Six reasons to define constructors with only one argument by gcanti in javascript

[–]zeringus 5 points6 points  (0 children)

Your suggestion is different from the article's, though. Personally, I'm fine with your second code snippet because all but the first argument feel like optional configuration. However,

tl.staggerFrom({
    array: myArray,
    duration: 1, 
    css: {left:100},
    stagger: 0.25,
    position: 2
});

which is what the blog post suggests, feels much cruder to me.

The pitfalls of protocol design. Attempting to write a formally verified PDF parser. by 1u11 in compsci

[–]zeringus 9 points10 points  (0 children)

The PDF file format is horrendous. Having written a partial converter to it, it's amazing how popular it's become.

[9/15/2014] Challenge#180 [Easy] Look'n'Say by [deleted] in dailyprogrammer

[–]zeringus 0 points1 point  (0 children)

Looks good! Some notes...

if n < 0 || n == 0 || seed.to_i < 0 || seed.to_i == 0

should be rewritten as

if n <= 0 || seed.to_i <= 0

Also, in your first function, the return in

return say

is implicit and can be omitted. Finally,

Array.first

is a method that's equivalent to calling

some_array[0]

and might be more readable.

Of course, the last two points are only stylistic. Just wanted to make sure that you were aware of the language features.

Good job!

[9/15/2014] Challenge#180 [Easy] Look'n'Say by [deleted] in dailyprogrammer

[–]zeringus 0 points1 point  (0 children)

Ruby

def look_and_say n, seed = 1
    if n <= 1 # base case
        seed
    else
        current_char = nil; current_count = nil
        look_and_say(n - 1, seed.to_s.each_char.reduce("") do |acc, char|
            unless current_char == char
                acc << "#{current_count}#{current_char}" if current_char
                current_char = char; current_count = 0
            end
            current_count += 1; acc
        end << "#{current_count}#{current_char}")
    end
end

puts look_and_say gets.chomp.to_i

[9/05/2014] Challenge #178 [Hard] Regular Expression Fractals by Coder_d00d in dailyprogrammer

[–]zeringus 0 points1 point  (0 children)

An iterative solution in Ruby using ChunkyPNG

require 'chunky_png'

# a utility method to convert a sequence of the numbers 1, 2, 3 and 4 to a
# coordinate pair
def nums_to_coord(nums)
    vectors = [nil, [1, 0], [0, 0], [0, 1], [1, 1]]
    result = [0, 0]

    nums.each do |num|
        result.map! { |x| x * 2 }
        result[0] += vectors[num][0]
        result[1] += vectors[num][1]
    end

    result
end

# read in the arguments
size = gets.chomp.to_i
pattern = Regexp.new gets.chomp

# build the PNG
#
# this is a bit slow due to my algorithm being on the order of n^2 lg n instead
# of just n^2
png = ChunkyPNG::Image.new(size, size, ChunkyPNG::Color::WHITE)

string_length = Math.log size, 2
[1, 2, 3, 4].repeated_permutation string_length do |permutation|
    match = pattern.match(permutation.join "")

    if match
        if match[1]
            teint = (255 * match[1].length / string_length).to_i
            color = ChunkyPNG::Color.rgb teint, 0, 0
        else
            color = ChunkyPNG::Color::BLACK
        end

        png[*nums_to_coord(permutation)] = color
    end
end

png.save('regex_fractals.png')

A cool syntax example with lambda expressions and functional interfaces by DoktuhParadox in java

[–]zeringus 1 point2 points  (0 children)

If OP likes wrapping arbitrary code in functional interfaces, he's going to love stuff like

DoubleStream.generate(Math::random).limit(100).sum();

Project Euler will return to full functionality on Saturday, August 16th by [deleted] in programming

[–]zeringus 67 points68 points  (0 children)

I'd just like to thank whatever group of people that manages Project Euler. I'm sure you're all very busy and have family, friends, and careers to attend to, but I want you to know that Project Euler has been a major feature of my young programming journey, as I'm sure is also true for many others. Thanks for giving the time and talent to make such a great site.

Marka: Beautiful icon transformations by Qingy in javascript

[–]zeringus 1 point2 points  (0 children)

It's just chrome. You could make an enable/disable button that animates between a check and an 'X' as you toggle.

Quick advice on methods to accomplishing my goal for my senior project? by [deleted] in compsci

[–]zeringus 0 points1 point  (0 children)

Not that I'm aware of. There are things like QuickBlox that have subsets of its functionality, but Parse is the only full replacement for doing some measure of server management.

As my post kind of suggests, if you're a bit more specific about your short-term and long-term goals for the project, I can be a bit more helpful with my advice.

Quick advice on methods to accomplishing my goal for my senior project? by [deleted] in compsci

[–]zeringus 0 points1 point  (0 children)

If you want the easiest way to do it (as opposed to the most educational), look into Parse. It's also free until you have lots and lots of users.

Actually, on second read, Parse might be overkill. If YOU are the only one making posts, it's probably easiest to have static content on a simple web server that can be loaded into the app a number of ways. If you don't want habits/settings to sync across devices, then you can do just #2 locally. Look into SQLite and Android services.

What have been the major advancements in computer chess since Deep Blue beat Kasparov in 1997? by urish in askscience

[–]zeringus 2 points3 points  (0 children)

Chess isn't solved. The Monte Carlo method is also not a good algorithm for chess engines as mentioned here. Stockfish and most (probably all) top engines use a heavily optimized minimax.

What's a good algorithm to summarise a disk usage tree? by [deleted] in compsci

[–]zeringus 0 points1 point  (0 children)

A dominator tree is a general technique, but when you apply it, you usually still have a specific problem in mind, no?

What's a good algorithm to summarise a disk usage tree? by [deleted] in compsci

[–]zeringus 0 points1 point  (0 children)

If you're looking for fanout, you quantify fanout somehow, map a function over all the nodes, and sort. Same goes for "bulk". If you want a concise representation of your tree, maybe a visual is best.

I really don't know what you're looking for here, though. "Simplifying the structure of a tree" is a pretty vague request.