Do you ever find yourself doing this?
tags = %w[foo bar baz]
tags << 'buz' unless tags.include?('buz')
Or:
tags << 'baz'
tags.uniq!
In both cases, we have an Array we want to use as a set, containing only unique elements.
One way to tackle this more cleanly is to simply use a Set.
require 'set'
tags = Set.new(%w[foo bar baz])
tags.add('foo')
tags.add('buz')
tags # => #<Set: {"foo", "bar", "baz", "buz"}>
But the Set and Array interfaces differ in some regards, and if other code is already expecting the collection to be an Array, that solution may not be practical.
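To make the interface difference concrete, here is a small sketch. The variable names are only illustrative. It shows that a Set has no positional indexing, and that converting back with to_a is one way to hand callers the Array they expect.

```ruby
require 'set'

tags = Set.new(%w[foo bar baz])

# A Set instance has no positional indexing, so Array-style tags[0] is unavailable.
tags.respond_to?(:[])    # => false

# Enumerable methods still work, since Set includes Enumerable.
tags.respond_to?(:first) # => true

# Converting back gives Array-expecting code something it can index.
tag_list = tags.to_a
tag_list[0]              # => "foo"
```

The conversion copies the elements, so this suits code that only reads the collection.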
As it happens, Array supports several basic set operations innately. You may already know about these, but in case you don’t, here are some examples.
Set union:
tags = %w[foo bar]
tags |= %w[foo buz] # => ["foo", "bar", "buz"]
Set difference:
tags = %w[foo bar]
tags - %w[bar baz] # => ["foo"]
Set intersection:
tags = %w[foo bar]
tags & %w[bar baz] # => ["bar"]
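As a rough sketch of how union can stand in for the "append unless included" idiom from the top of the post: the example below uses |= to add elements only when they are missing. One thing to watch is that |= builds a new Array and reassigns the variable, so any other reference to the original Array (the hypothetical other_reference here) does not see the change.

```ruby
tags = %w[foo bar baz]
other_reference = tags # hypothetical second reference to the same Array object

# Instead of: tags << 'buz' unless tags.include?('buz')
tags |= %w[buz] # 'buz' is missing, so it is added
tags |= %w[foo] # 'foo' is already present, so nothing changes

tags            # => ["foo", "bar", "baz", "buz"]
other_reference # => ["foo", "bar", "baz"]
```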
It’s a small thing, but perhaps it will save you a few lines of code.
UPDATE: My WordPress “related posts” feature points out that I have officially begun to repeat myself. Ah well. If nothing else this article has a bit more explanation than the one from 2010.
I also love using sets in testing when I want to compare two arrays but I don’t care about order. Clearer than sorting them.
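A minimal sketch of that comparison, with made-up sample data. Note that converting to sets also collapses duplicates, so this only fits cases where repeated elements don't matter.

```ruby
require 'set'

expected = %w[foo bar baz]
actual   = %w[baz foo bar]

expected == actual               # => false, Array equality is order-sensitive
expected.to_set == actual.to_set # => true, Set equality ignores order

# Caveat: duplicates collapse, so these also compare as equal.
%w[foo foo].to_set == %w[foo].to_set # => true
```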
RSpec provides a =~ matcher for comparing arrays without regard to order. It’s quite helpful. Not saying you do or should use RSpec; it just seems to be a lesser-known matcher, so I’m sharing it in case anybody finds it useful.
The Set trick is neat too, I will keep that in mind!
Hey, cool!
Yeah, that’s neat. Kind of an odd overloading, but I guess “matches” makes sense for the arrays.
FWIW, the recommended way to match an array w/o order is:
expect(array).to match_array(other_array)
The expect syntax is the recommended syntax to use with matchers now, and it intentionally does not support operator matchers. Read this for more info: http://myronmars.to/n/dev-blog/2012/06/rspecs-new-expectation-syntax
Thanks for the tip. That definitely reads more explicitly.
Good stuff. I use sets all the time, not simply because I want set semantics, but also because Set#include? is O(1) and Array#include? is O(N). When you’re checking membership in a collection in a tight loop, using a set rather than an array can make a big difference. Last week I optimized a method that was taking over 20 minutes to run (w/o doing any IO) down to 16 seconds by changing an array to a set.
“My WordPress “related posts” feature points out that I have officially begun to repeat myself.”
Well, you need to store your posts in a Set, not an Array!
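For the Set#include? versus Array#include? point above, here is a rough timing sketch. The collection sizes, the lookups data, and the elapsed helper are all made up for illustration. Actual numbers depend on the machine and Ruby version, so no timings are shown.

```ruby
require 'set'

ids     = (1..10_000).to_a
id_set  = ids.to_set
lookups = Array.new(1_000) { |i| i * 20 } # hypothetical mix of hits and misses

# Hypothetical helper: wall-clock seconds taken by the given block.
elapsed = lambda do |&work|
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  work.call
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Array#include? scans elements one by one; Set#include? is a hash lookup.
array_time = elapsed.call { lookups.each { |id| ids.include?(id) } }
set_time   = elapsed.call { lookups.each { |id| id_set.include?(id) } }

puts format('Array: %.4fs, Set: %.4fs', array_time, set_time)
```

Building the set has its own one-time cost, so the switch pays off mainly when the same collection is queried many times.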
In the third example you can do &= to set the variable to the intersected result.
You can similarly do -= like the second example. 🙂
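A short sketch of those two in-place variants, using made-up tag values:

```ruby
tags = %w[foo bar baz]

tags &= %w[bar baz buz] # keep only elements present in both
tags                    # => ["bar", "baz"]

tags -= %w[baz]         # drop any elements found in the right-hand Array
tags                    # => ["bar"]
```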
Indeed a very good post. I wasn’t familiar with the union functionality, and I can really see use cases where it might come in handy.
It’s interesting that I haven’t seen much code using Set. I wonder why that is? Is it just because people aren’t familiar with it?
Hey Avdi, I have a small gem that encompasses set operations on Array https://github.com/tehpeh/set_theory
Cheers!
As of r36853[1], Set has <, <=, >, >= operators.
1: https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/36853
Great, it is interesting