Class: String

Inherits: Object

Defined in: lib/kyanite/string/misc.rb,
lib/kyanite/string/list.rb,
lib/kyanite/string/diff.rb,
lib/kyanite/string/cast.rb,
lib/kyanite/string/chars.rb,
lib/kyanite/string/split.rb,
lib/kyanite/string/nested.rb,
lib/kyanite/string/random.rb,
lib/kyanite/string/include.rb,
lib/kyanite/general/classutils.rb,
lib/kyanite/string/chars_const.rb,
lib/kyanite/enumerable/structure.rb

String Additions

Kyanite definitions: String
Kyanite tests and examples: Cast Chars Diff Database-Helper Miscellaneous Nested Split
Usage: require 'kyanite/string'

Required from Facets String:

shatter(re): Breaks a string up into an array based on a regular expression. Similar to scan, but includes the matches.

Database-Helper

- (Array) enum_to_array

Converts MySQL-Enum to Array.
- (String) list_with(elemente, options = {}, &block)

Generates WHERE clause from Array.
- (String) sql_regexp_for_kommaliste

Returns an SQL regular expression for searching a comma-separated list in Postgres.

Overlap / Diff

- (String) diff(b)

Returns the part in which two strings differ.
- (String) overlap(b)

Returns the leading part that two strings have in common.
- (Array) overlapdiff(b)

Returns diff and overlap in one array.

Miscellaneous

- (Integer) count_words

Counts the number of words.
- (Boolean) include?(input)

Now also accepts an Array as input parameter.
- (false) is_collection?

false -- Defined for all objects: Do I contain multiple objects? String and Range are not considered as collection.
- (String) mgsub(search_and_replace_pairs)

String substitution like gsub, but replaces multiple patterns in one pass.

Cast

- (String) from_x

Get a char for a hex representation.
- (Integer) to_identifier

Converts a string into the most plausible Identifier.
- (Integer) to_integer

Converts a string to an integer, even if the number was appended to anything.
- (Integer, String) to_integer_optional

Tries to convert a string to an integer.
- (String, Nil) to_nil

Non-empty strings are returned.
- (String) to_x

Get a hex representation for a char.

Clear / Format Text

- (Object) mysqlize

Converts a string so that strings which MySQL treats as equal under the utf8_general_ci collation can be recognized as such.
- (String) reduce(options = {})

Reduces a rich unicode string to a very limited character set like humans do.
- (String) reduce53(options = {})

Reduces the string to a base53 encoding.
- (String) reduce53!(options = {})

In-place-variant of reduce53.
- (String) reduce94(options = {})

Deprecated.
- (Array) to_a

reverse of Array#to_s_utf8.
- (Array) to_array_of_codepoints

reverse of Array#to_s_utf8.
- (Array) to_array_of_hex

Upcase and Downcase with support for special letters like German umlauts

- (String) capitalize

Converts the first letter to upcase, also works with special letters like German umlauts.
- (Boolean) capitalized?

Is the first letter upcase? Also works with special letters like German umlauts.
- (String) downcase2

Better downcase: also works with special letters like German umlauts.
- (String) downcase2!

In-place-variant of downcase2.
- (Boolean) downcase?

Is the string downcase? Also works with special letters like German umlauts.
- (String) upcase2

Better upcase: also works with special letters like German umlauts.
- (String) upcase2!

In-place-variant of upcase2.
- (Boolean) upcase?

Is the string upcase? Also works with special letters like German umlauts.

Split

- (String) cut(len = 5)

Cuts a string to a maximal length.
- (String) extract(start_regexp, stop_regexp)

Extracts a substring using two regular expressions.
- (String) fixsize(len)

Forces a fixed size.
- (String) nchar(n, replacement = nil)

Returns n characters of the string.
- (Array) split_by_index(idx)

Cuts a string in parts with given length.
- (Array) split_numeric

Separates a string into numeric and alphanumeric parts.
- (String) without_versioninfo

Removes numeric parts and trailing white spaces, hyphens, underscores, and periods.

Nested

- (String) anti

Returns the matching opposite bracket.
- (Range) index_bracket(pattern = nil, start = 0, last_found = nil)

Returns the positions of the next bracket pair.
- (String) mask(options = {}, &block)

Applies the block to a hierarchically defined substring of the string.
- (Integer) nestinglevel(pattern = /[{<(\[]/)

Returns the depth of nesting (number of nesting levels).

Random

+ (String) random(type = :en, size = 16)

Generates a random string.
- (String) shuffle(separator = //)

Returns the characters of the string in random order.
- (String) shuffle!(separator = //)

In-place-variant of shuffle.

Class Utils

- (String) camelize(first_letter_in_uppercase = true)

By default, camelize converts strings to UpperCamelCase.
- (Class) constantize

Tries to find a constant with the name specified in the argument string.
- (String) demodulize

Removes the module part from the expression in the string.
- (Class) to_class

Converts to a class, the reverse of to_classname.
- (String) to_classname

Converts to a class name, the reverse of to_class.
- (String) underscore

The reverse of camelize.

Constant Summary

MYSQL_REPLACES = [ [/ä/, 'a'], [/ö/, 'o'], [/ü/, 'u'], [/Ä/, 'a'], [/Ö/, 'o'], [/Ü/, 'u'], [/ss/, 'ß'], [/SS/, 'ß'] ]
STRING_RANDOM_BASIS = string_random_basis

Class Method Details

+ (`String`) random(type = :en, size = 16)

Generates a random string. Example:

String.random( :de, 20)
=> brräbkpßrdirnenshrnh

String.random( :en, 20)
=> euduwmtenohrtnaeewsc

String.random( :password, 20)
=> PzAAGW2ALsbJRnljI6ho

Parameters:

type (Symbol) (defaults to: :en) —

Char base with letter frequency of the random string (:de, :en or :password)
size (Integer) (defaults to: 16) —

Length of the random string

Returns:

(String) —

Random string



# File 'lib/kyanite/string/random.rb', line 121

def String.random( type=:en, size=16)
  (0...size).map { STRING_RANDOM_BASIS[type][Kernel.rand(STRING_RANDOM_BASIS[type].size)] }.join    
end
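
The one-liner above draws each character uniformly from a basis string, so a letter's frequency depends on how often it is repeated in that string. The following is a minimal standalone sketch of that idea; `sketch_basis` and `random_string_sketch` are hypothetical names, and the basis strings are made up because the real contents of STRING_RANDOM_BASIS are not shown on this page.

```ruby
# Hypothetical basis table: letters that should appear more often are simply repeated.
def sketch_basis
  {
    :vowel_heavy => 'aaaaeeeeeeiiioouu' + 'bcdfghklmnprst',
    :digits      => '0123456789'
  }
end

# Picks `size` characters uniformly at random from the chosen basis string.
def random_string_sketch(type = :vowel_heavy, size = 16)
  basis = sketch_basis.fetch(type)
  (0...size).map { basis[Kernel.rand(basis.size)] }.join
end

puts random_string_sketch(:vowel_heavy, 20)
puts random_string_sketch(:digits, 6)
```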

Instance Method Details

- (`String`) anti

Returns the matching opposite bracket. Examples:

'('.anti          ->  ')'
'{'.anti          ->  '}'
']'.anti          ->  '['
'<hallo>'.anti    ->  '</hallo>'
'</hallo>'.anti   ->  '<hallo>'

See tests and examples here.

Returns:

(String) —

opposite bracket

# File 'lib/kyanite/string/nested.rb', line 25

def anti
  if self.size == 1
    return self.tr('([{<)]}>',')]}>([{<')
  else
    if self =~ /<([^\/].*)>/
      return "</#{$1}>"
    elsif self =~ /<\/(.*)>/
      return "<#{$1}>"        
    end
  end
  return self
end

- (`String`) camelize(first_letter_in_uppercase = true)

By default, camelize converts strings to UpperCamelCase. If the argument to camelize is set to :lower then camelize produces lowerCamelCase.

camelize will also convert '/' to '::' which is useful for converting paths to namespaces.

Examples:

"active_record".camelize                # => "ActiveRecord"
"active_record".camelize(:lower)        # => "activeRecord"
"active_record/errors".camelize         # => "ActiveRecord::Errors"
"active_record/errors".camelize(:lower) # => "activeRecord::Errors"

Returns:

(String)

# File 'lib/kyanite/general/classutils.rb', line 123

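def camelize(first_letter_in_uppercase = true)
  # :lower is truthy in Ruby, so it has to be checked explicitly
  # to get the documented lowerCamelCase result.
  if first_letter_in_uppercase && first_letter_in_uppercase != :lower
    self.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
  else
    # Camelize fully, then restore the original first character.
    self[0, 1] + camelize[1..-1].to_s
  end
end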

- (`String`) capitalize

Converts the first letter to upcase, also works with special letters like German umlauts.

Returns:

(String)



# File 'lib/kyanite/string/chars.rb', line 240

def capitalize
  (slice(0) || '').upcase2 + (slice(1..-1) || '').downcase2
end

- (`Boolean`) capitalized?

Is the first letter upcase? Also works with special letters like German umlauts.

Returns:

(Boolean)



# File 'lib/kyanite/string/chars.rb', line 246

def capitalized?
  self =~ TR_UPCASE_ALL_REGEXP
end

- (`Class`) constantize

Tries to find a constant with the name specified in the argument string:

"Module".constantize     # => Module
"Test::Unit".constantize # => Test::Unit

The name is assumed to be the one of a top-level constant, no matter whether it starts with "::" or not. No lexical context is taken into account:

C = 'outside'
module M
  C = 'inside'
  C               # => 'inside'
  "C".constantize # => 'outside', same as ::C
end

Returns:

(Class)

# File 'lib/kyanite/general/classutils.rb', line 166

def constantize
  unless /\A(?:::)?([A-Z]\w*(?:::[A-Z]\w*)*)\z/ =~ self
    raise NameError, "#{self.inspect} is not a valid constant name!"
  end

  Object.module_eval("::#{$1}", __FILE__, __LINE__)
end

- (`Integer`) count_words

Counts the number of words.

Returns:

(Integer) —

number of words

# File 'lib/kyanite/string/misc.rb', line 20

def count_words
  n = 0
  scan(/\b\S+\b/) { n += 1}
  n
end
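
The counting rule can be tried without loading Kyanite. This is a minimal sketch of the same regular expression as a plain function; `count_words_sketch` is a hypothetical name used only for illustration.

```ruby
# Counts runs of non-whitespace that start and end at a word boundary.
def count_words_sketch(text)
  text.scan(/\b\S+\b/).size
end

puts count_words_sketch('Hello brave new world')   # 4
puts count_words_sketch('')                        # 0
```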

- (`String`) cut(len = 5)

Cuts a string to a maximal length. Example:

'Hello'.cut(3) => 'Hel'

See tests and examples here.

Returns:

(String) —

Substring

# File 'lib/kyanite/string/split.rb', line 61

def cut(len=5)
    return '' if len <= 0
    self[0..len-1]
end

- (`String`) demodulize

Removes the module part from the expression in the string.

Examples:

"ActiveRecord::CoreExtensions::String::Inflections".demodulize # => "Inflections"
"Inflections".demodulize                                       # => "Inflections"

Returns:

(String)



# File 'lib/kyanite/general/classutils.rb', line 140

def demodulize
  self.gsub(/^.*::/, '')
end

- (`String`) diff(b)

Returns the part in which two strings differ. Example:

"Hello darling".diff("Hello") 
=> " darling"

When in doubt, the differing part of the longer string is returned. If both strings have the same length, the differing part of self is returned.

See more examples and tests here.

Returns:

(String) —

differencing part

# File 'lib/kyanite/string/diff.rb', line 41

def diff(b)
  return self if b.nil?
  b = b.to_str
  return '' if self == b                  # no difference

  a = self
  a, b = b, a if a.size >= b.size         # a is now shorter than or as long as b
  overlap = a.overlap(b)
  return self if overlap == ''
  return b.split(overlap)[1]
end

- (`String`) downcase2

Better downcase: also works with special letters like German umlauts. (If you overwrite downcase you will get strange results if you use Active Support.)

See tests and examples here.

Returns:

(String)



# File 'lib/kyanite/string/chars.rb', line 212

def downcase2 
  self.tr(TR_UPCASE, TR_DOWNCASE).downcase
end

- (`String`) downcase2!

In-place-variant of downcase2.

Returns:

(String)



# File 'lib/kyanite/string/chars.rb', line 218

def downcase2!
  # tr! and downcase! return nil when nothing changes, so they must not be chained.
  self.tr!(TR_UPCASE, TR_DOWNCASE)
  self.downcase!
  self
end

- (`Boolean`) downcase?

Is the string downcase? Also works with special letters like German umlauts.

Returns:

(Boolean)



# File 'lib/kyanite/string/chars.rb', line 256

def downcase?
  (self == self.downcase2)
end

- (`Array`) enum_to_array

Converts MySQL-Enum to Array.

Returns:

(Array)



# File 'lib/kyanite/string/list.rb', line 84

def enum_to_array
  self[5..-2].gsub("'",'').split(',').collect {|i| [i,i] }
end
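
No example is given for this method, so here is a minimal standalone sketch of what the one-liner does to a typical MySQL column definition. `enum_to_pairs_sketch` is a hypothetical name, and the input format `enum('a','b')` is an assumption based on the slice `[5..-2]`.

```ruby
# Drops the leading "enum(" (5 characters) and the trailing ")",
# removes the quotes, and returns each value as a [value, value] pair.
def enum_to_pairs_sketch(enum_definition)
  inner = enum_definition[5..-2]
  inner.gsub("'", '').split(',').collect { |value| [value, value] }
end

p enum_to_pairs_sketch("enum('red','green','blue')")
# [["red", "red"], ["green", "green"], ["blue", "blue"]]
```

Note that the result is an array of pairs rather than a flat array of values.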

- (`String`) extract(start_regexp, stop_regexp)

Extracts a substring using two regular expressions. Example:

string = '<select id="hello"><option value="0">none</option></select>'
string.extract(  /select.*?id="/  ,  '"'  )  =>  'hello'

See tests and examples here.

Parameters:

start_regexp (RegExp, String) —

Start extraction here
stop_regexp (RegExp, String) —

Stop extraction here

Returns:

(String) —

Substring



# File 'lib/kyanite/string/split.rb', line 105

def extract( start_regexp, stop_regexp )
  split(start_regexp)[1].split(stop_regexp)[0]
end

- (`String`) fixsize(len)

Forces a fixed size.

See tests and examples here.

Returns:

(String)

# File 'lib/kyanite/string/split.rb', line 71

def fixsize( len )
  return '' if len <= 0
  if self.size < len
    self.ljust(len)
  else
      self[0..len-1]
  end
end
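
A short illustration of both branches (padding and truncating) may help. This is a standalone sketch of the logic above as a plain function; `fixsize_sketch` is a hypothetical name.

```ruby
# Pads short strings with spaces on the right and truncates long ones.
def fixsize_sketch(text, len)
  return '' if len <= 0
  text.size < len ? text.ljust(len) : text[0, len]
end

p fixsize_sketch('abc', 5)        # "abc  "
p fixsize_sketch('abcdefgh', 5)   # "abcde"
p fixsize_sketch('abc', 0)        # ""
```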

- (`String`) from_x

Get a char for a hex representation. See also to_x.

Returns:

(String)

# File 'lib/kyanite/string/cast.rb', line 91

def from_x
  str, q, first = '', 0, false
  each_byte { |byte|
    # Our hex chars are 2 bytes wide, so we have to keep track
    # of whether it's the first or the second of the two.
    #
    # NOTE: inject with each_slice(2) would be a natural fit,
    # but it's kind of slow...
    if first = !first
      q = HEX_CHARS.index(byte)
    else
      # Now we got both parts, so let's do the
      # inverse of divmod(16): q * 16 + r
      str << q * 16 + HEX_CHARS.index(byte)
    end
  }
  str
end
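
The divmod(16) round trip between to_x and from_x can be sketched independently of Kyanite. The digit table below is an assumption (lowercase hex), since HEX_CHARS is not shown on this page, and all method names are hypothetical.

```ruby
# Assumed digit table; Kyanite's HEX_CHARS may differ (for example in letter case).
def hex_digits_sketch
  '0123456789abcdef'
end

# Encodes every byte as two hex digits: quotient and remainder of division by 16.
def to_hex_sketch(text)
  hex = ''.dup
  text.each_byte do |byte|
    q, r = byte.divmod(16)
    hex << hex_digits_sketch[q] << hex_digits_sketch[r]
  end
  hex
end

# Inverse operation: every pair of hex digits becomes q * 16 + r again.
def from_hex_sketch(hex)
  bytes = hex.each_char.each_slice(2).map do |high, low|
    hex_digits_sketch.index(high) * 16 + hex_digits_sketch.index(low)
  end
  bytes.pack('C*')
end

puts to_hex_sketch('Az')       # 417a
puts from_hex_sketch('417a')   # Az
```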

- (`Boolean`) include?(input)

Now also accepts an Array as input parameter. The array elements are ORed, i.e. include? is true if old_include? is true for at least one element of the array. Every string includes '', [] and nil. Nil does not include anything: nil.include? => false

Returns:

(Boolean)

# File 'lib/kyanite/string/include.rb', line 20

def include?(input)
  return true if input.nil?
  return true if input.empty?
  if ( input.respond_to?(:each)  &&  !input.kind_of?(String) )
   input.each do |frag|
      return true if include?(frag)
   end
   false
  else
    old_include?(input)
  end
end
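
The OR semantics described above can be sketched as a plain function that leaves String#include? untouched. `include_any_sketch` is a hypothetical name; this is an illustration of the rule, not the library's code path.

```ruby
# True if `text` contains `input`, or at least one element of `input` when it is a collection.
# nil and empty inputs count as "included", following the description above.
def include_any_sketch(text, input)
  return true if input.nil? || input.empty?
  if input.respond_to?(:each) && !input.is_a?(String)
    input.any? { |fragment| include_any_sketch(text, fragment) }
  else
    text.include?(input)
  end
end

p include_any_sketch('hello world', ['xyz', 'wor'])   # true
p include_any_sketch('hello world', ['xyz', 'abc'])   # false
p include_any_sketch('hello world', [])               # true
p include_any_sketch('hello world', nil)              # true
```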

- (`Range`) index_bracket(pattern = nil, start = 0, last_found = nil)

Returns the positions of the next bracket pair. Example:

'Hello[welt]wort'.index_bracket  ->  5..10

See tests and examples here.

Parameters:

start (Integer) (defaults to: 0) —

Search from this starting position
pattern (RegExp, String) (defaults to: nil) —

Search only this type of bracket

Returns:

(Range) —

Positions of brackets

# File 'lib/kyanite/string/nested.rb', line 47

def index_bracket( pattern=nil, start=0, last_found = nil )
  return nil if self.empty?
  pattern = /[{<\[]/ unless pattern    
  # pattern = /['"({<(\[]/  unless pattern 
  debug = false
  puts 'examining ' + self[start..-1]      if debug
  found = self.index(pattern, start) 
  puts "found=#{found}"                    if debug
  return last_found unless found
  pattern_anti = self[found..found].anti
  startpunkt = found
  loop do
    found_next =  self.index( pattern,      startpunkt+1 ) || 9999999
    found_anti =  self.index( pattern_anti, startpunkt+1 ) 
    puts "found_next=#{found_next}"        if debug
    puts "found_anti=#{found_anti}"        if debug
    break unless found_anti
    return found..found_anti   if found_anti <= found_next
    # puts
    # puts
    # puts
    # puts
    # puts
    # puts "start=#{(start).inspect_pp}"
    # puts "pattern=#{(pattern).inspect_pp}"
    # puts "found_next=#{(found_next).inspect_pp}"
    # puts "found..found_anti=#{(found..found_anti).inspect_pp}"
    rekursiv_result = self.index_bracket(pattern, found_next, found..found_anti)
    return found..found_anti unless rekursiv_result
    startpunkt = rekursiv_result.last
    puts "startpunkt=#{startpunkt}"        if debug
  end # loop
  nil
end

- (`false`) is_collection?

false -- Defined for all objects: Do I contain multiple objects? String and Range are not considered as collection.

Tests and examples here.

Returns:

(false)

# File 'lib/kyanite/enumerable/structure.rb', line 48

def is_collection?; false; end

- (`String`) list_with(elemente, options = {}, &block)

Generates WHERE clause from Array.

Example:

array = ['Anna','Birte','Charlie']
"kisses_from = ".list_with(array)
=> "kisses_from = 'Anna' OR kisses_from = 'Birte' OR kisses_from = 'Charlie'"

See tests and more examples here.

Returns:

(String)

# File 'lib/kyanite/string/list.rb', line 26

def list_with(  elemente, options = {}, &block )

  options = { :pre    => %q{'},
              :post   => %q{'},
              :sep    => ' OR ',
              :empty => 'FALSE'}.merge(options)

  
  # no list given
  return options[:empty]     if elemente.empty?

  # a single String or Symbol given -> output without separator
  if elemente.kind_of?(String)  ||  !elemente.respond_to?(:each_index)
    e = elemente.dup
    e = yield e               if block_given?
    return "#{self}#{options[:pre]}#{e}#{options[:post]}"
  end

  # list has only one element -> output without separator
  if elemente.size <= 1
    e = elemente[0].dup
    e = yield e               if block_given?
    return "#{self}#{options[:pre]}#{e}#{options[:post]}"
  end

  # list has several elements
  result = ''
  elemente[0..-2].each do |e|
    # leading elements, each followed by the separator
    e = yield e               if block_given?
    result += "#{self}#{options[:pre]}#{e}#{options[:post]}#{options[:sep]}"
  end
    # last element, without separator
    e = elemente[-1].dup
    e = yield elemente[-1]    if block_given?
    result += "#{self}#{options[:pre]}#{e}#{options[:post]}" 
  result
end
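
The core of list_with is a map-and-join over the elements. A minimal standalone sketch of that core, assuming the same defaults as the options hash above, could look as follows; `where_list_sketch` and its keyword names are hypothetical, and the block handling of the original is left out.

```ruby
# Builds "prefix'value'" for every value and joins the pieces with the separator.
# Values are inserted as-is, without any SQL escaping.
def where_list_sketch(prefix, values, pre: "'", post: "'", sep: ' OR ', empty: 'FALSE')
  return empty if values.empty?
  values.map { |value| "#{prefix}#{pre}#{value}#{post}" }.join(sep)
end

puts where_list_sketch('kisses_from = ', ['Anna', 'Birte', 'Charlie'])
# kisses_from = 'Anna' OR kisses_from = 'Birte' OR kisses_from = 'Charlie'
puts where_list_sketch('kisses_from = ', [])
# FALSE
```

Because the values are interpolated directly into the clause, untrusted input should be escaped or bound as parameters before it reaches a query.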

- (`String`) mask(options = {}, &block)

Applies the block to a hierarchically defined substring of the string.

See tests and examples here.

Parameters:

options (Hash) (defaults to: {})
block (Block)

Options Hash (options):

:level_start (Integer)
:level_end (Integer)
:level_akt (Integer)
:pattern (RegExp, String)
:skip_empty (Boolean)
:param_level (Boolean)
:with_brackets (Boolean)

Returns:

(String)

Raises:

(ArgumentError)

# File 'lib/kyanite/string/nested.rb', line 97

def mask( options={}, &block )

  # prepare
  debug = false
  result = self.dup
  
  level_start =   options[:level_start]   || 1
  level_end =     options[:level_end]     || 99999
  level_akt =     options[:level_akt]     || 0
  #level_akt += 1 if with_brackets    
  pattern =       options[:pattern]       || /[{<\[]/ # /['"({<\[]/ is slow
  skip_empty =    options[:skip_empty]    || false
  param_level =   options[:param_level]   || false    # additionally passes the number of the current level to the block
  with_brackets = options[:with_brackets] || false    # also passes the brackets to the block; see the tests for an example
  
  raise ArgumentError, "level_start can't be nil"             unless level_start 
  raise ArgumentError, "level_end can't be nil"               unless level_end 
  raise ArgumentError, 'level_end has to be >= level_start'   unless level_end >= level_start
  if debug 
    puts "level_start=#{level_start}"   
    puts "level_end=#{level_end}"   
    puts "level_akt=#{level_akt}"    
    puts
  end          
  geklammert = result.index_bracket(pattern)
  puts "geklammert=#{geklammert}"  if debug
  
  
  # Here we go: bracketed, the brackets are not passed along
  if geklammert 
    if !with_brackets
      if geklammert.first > 0
        pre =   result[0..geklammert.first-1]
      else
        pre =   ''
      end
      bra =     result[geklammert.first..geklammert.first]
      mid =     result[geklammert.first+1..geklammert.last-1]
      ket =     result[geklammert.last..geklammert.last]
      if geklammert.last < (result.size-1)
        past =  result[geklammert.last+1..-1]
      else
        past =  ''
      end  
    else # with_brackets
      if geklammert.first > 0
        pre =   result[0..geklammert.first]
      else
        pre =   result[geklammert.first..geklammert.first]
      end
      bra =     ''
      mid =     result[geklammert.first+1..geklammert.last-1]
      ket =     ''
      if geklammert.last < (result.size-1)
        past =  result[geklammert.last..-1]
      else
        past =  result[geklammert.last..geklammert.last]
      end        
    end
    if debug 
      puts "1pre=#{pre}"   
      puts "1bra=#{bra}"   
      puts "1mid=#{mid}"   
      puts "1ket=#{ket}"   
      puts "1past=#{past}"   
      puts
    end     
    
    # yield
    if ( (level_start..level_end)  === level_akt  &&  (!pre.empty? || !skip_empty) )     
      if param_level
        pre =  yield(pre,level_akt)   
      else
        pre =  yield(pre) 
      end
    end # if yield
    mid =  mid.mask(  options.merge({:level_akt => level_akt+1}), &block )
    past = past.mask( options, &block )
    if debug 
      puts "2pre=#{pre}"   
      puts "2bra=#{bra}"   
      puts "2mid=#{mid}"   
      puts "2ket=#{ket}"   
      puts "2past=#{past}"   
      puts
    end
    
    return (pre||'') + bra + (mid||'') + ket + (past||'')

    
  
  # Here we go: no brackets
  else
    # yield
    
    if ( (level_start..level_end)  === level_akt  &&  (!result.empty? || !skip_empty ) ) 

      puts "result=#{result}\n" if debug
      if param_level
        result =  yield(result,level_akt)   
      else
        result=  yield(result) 
      end      
    end # if yield
  return (result||'')       
  end      

  raise 'no go'
    
end

- (`String`) mgsub(search_and_replace_pairs)

String substitution like gsub, but replaces multiple patterns in one pass. Example:

"between".mgsub([[/ee/, 'II'], [/e/, 'E']])      
=> "bEtwIIn"

Tests here.

Returns:

(String)

# File 'lib/kyanite/string/misc.rb', line 32

def mgsub(search_and_replace_pairs)
  patterns = search_and_replace_pairs.collect { |search, replace| search }
  gsub(Regexp.union(*patterns)) do |match|
    search_and_replace_pairs.detect{ |search, replace| search =~ match}[1]
  end	
end
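
The single pass matters when one replacement produces text that another pattern would match. The following standalone sketch repeats the logic above as a plain function (`mgsub_sketch` is a hypothetical name) and compares it with chained gsub calls.

```ruby
# Replaces all patterns in one pass: the union regexp finds the next match,
# then the first pair whose pattern matches it supplies the replacement.
def mgsub_sketch(text, pairs)
  patterns = pairs.map { |search, _replace| search }
  text.gsub(Regexp.union(*patterns)) do |match|
    pairs.detect { |search, _replace| search =~ match }[1]
  end
end

swap = [[/a/, 'b'], [/b/, 'a']]
puts mgsub_sketch('abba', swap)                   # baab
puts 'abba'.gsub(/a/, 'b').gsub(/b/, 'a')         # aaaa (second gsub undoes the first)
puts mgsub_sketch('between', [[/ee/, 'II'], [/e/, 'E']])   # bEtwIIn
```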

- (`Object`) mysqlize

Converts a string so that strings which MySQL treats as equal under the utf8_general_ci collation can be recognized as such.



# File 'lib/kyanite/string/chars.rb', line 184

def mysqlize
  self.mgsub(MYSQL_REPLACES).downcase.to_s
end

- (`String`) nchar(n, replacement = nil)

Returns n characters of the string. If n is positive the characters are from the beginning of the string. If n is negative from the end of the string.

Alternatively a replacement string can be given, which will replace the n characters. Example:

'abcde'.nchar(1)  =>      'a'
'abcde'.nchar(2)  =>      'ab'
'abcde'.nchar(3)  =>      'abc'
'abcde'.nchar(2,'')  =>   'cde'

(The original version of this method is from the Facets library.) See tests and examples here.

Returns:

(String) —

Substring

# File 'lib/kyanite/string/split.rb', line 39

def nchar(n, replacement=nil)
  if replacement
    return self   if n == 0     
    return ''     if (n.abs >= self.length)      
    s = self.dup
    n > 0 ? (s[0...n] = replacement) : (s[n..-1] = replacement)
    return s
    
  # without replacement
  else
    return ''   if n == 0
    return self if (n.abs > self.length)
    n > 0 ? self[0...n] : self[n..-1]
  end
end

- (`Integer`) nestinglevel(pattern = /[{<(\[]/)

Returns the depth of nesting (number of nesting levels).

Parameters:

pattern (RegExp, String) (defaults to: /[{<(\[]/) —

Search only this type of bracket

Returns:

(Integer) —

Depth of nesting

# File 'lib/kyanite/string/nested.rb', line 213

def nestinglevel(pattern=/[{<(\[]/)
  result = 0
  self.mask( :level_start => 0,
             :pattern => pattern,
             :param_level => true ) { |s,l|
    result = l if l > result   # remember the deepest level seen so far
    s                          # return the substring unchanged
  }
  result
end
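
For intuition, nesting depth can also be computed with a simple counter. The sketch below is a swapped-in, simplified technique and not Kyanite's mask-based implementation: it does not check that bracket types match, and `nesting_depth_sketch` and its parameters are hypothetical.

```ruby
# Walks the string once, counting how deep the opening brackets stack up.
def nesting_depth_sketch(text, open = '([{<', close = ')]}>')
  depth = 0
  deepest = 0
  text.each_char do |char|
    if open.include?(char)
      depth += 1
      deepest = depth if depth > deepest
    elsif close.include?(char)
      depth -= 1 if depth > 0
    end
  end
  deepest
end

puts nesting_depth_sketch('a(b[c]d)e')    # 2
puts nesting_depth_sketch('plain text')   # 0
```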

- (`String`) overlap(b)

Returns the leading part that two strings have in common. Example:

"Hello world".overlap("Hello darling") 
=> "Hello "

See more examples and tests here.

Returns:

(String) —

mutual part

# File 'lib/kyanite/string/diff.rb', line 21

def overlap(b)
    return ''         if b.nil?
    b = b.to_str
    return self     if self == b
    return ''         if self[0] != b[0]

    n = [self.size, b.size].min
    (0..n).each { |i|  return self[0..i-1] unless self[i] == b[i] }
end

- (`Array`) overlapdiff(b)

Returns diff and overlap in one array. Takes as much time as diff alone.

Symmetry: If we add overlap + diff, we always get the longer of the two original strings. If both have the same length, we get self.

See examples and tests here.

Returns:

(Array) —

mutual part, differencing part

# File 'lib/kyanite/string/diff.rb', line 66

def overlapdiff(b)
  return '', self       if b.nil?
  b = b.to_str
  return self, ''       if self == b         # no difference

  a = self
  a, b = b, a           if a.size >= b.size  # a is now shorter than or as long as b
  overlap = a.overlap(b)
  return overlap, self if overlap == ''
  return overlap, b.split(overlap)[1]
end
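
The symmetry rule can be checked with a small standalone sketch. It follows the rule as stated (common prefix plus the rest of the longer string) and may differ from the library in edge cases such as an empty overlap; `common_prefix_sketch` and `overlap_and_diff_sketch` are hypothetical names.

```ruby
# Longest common prefix of two strings.
def common_prefix_sketch(a, b)
  i = 0
  i += 1 while i < a.size && i < b.size && a[i] == b[i]
  a[0, i]
end

# Returns [overlap, diff] so that overlap + diff is the longer string (ties go to a).
def overlap_and_diff_sketch(a, b)
  longer = b.size > a.size ? b : a
  prefix = common_prefix_sketch(a, b)
  [prefix, longer[prefix.size..-1]]
end

overlap, diff = overlap_and_diff_sketch('Hello world', 'Hello darling')
p overlap          # "Hello "
p diff             # "darling"
p overlap + diff   # "Hello darling"
```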

- (`String`) reduce(options = {})

Reduces a rich unicode string to a very limited character set like humans do. Example:

"Céline hören".reduce
=> "Celine hoeren"

Handles all characters from ISO/IEC 8859-1 and CP1252 like humans do, not just deleting the accents. So it’s not a 1:1 translation: some Unicode characters are translated to multiple characters. Example:

"ÄÖÜäöüß".reduce
=> "AeOeUeaeoeuess"

For many unicode characters, this behaviour is based on UnicodeUtils.nfkd. Example:

ffi = "\uFB03"
ix = "\u2168"
high23="²³"
high5 = "\u2075"
all = ffi + ix + high23 + high5 
all.reduce
=> "ffiIX235"

You can preserve some characters, e.g. all special characters of a specific language. Example:

"Céline hören 10€".reduce( :preserve => "ÄÖÜäöüß")
=> "Celine hören 10EUR"

Newlines are preserved by default, but all other nonprintable ascii characters below \x20 are removed.

There is also a fast mode. It’s about 10 times faster, but it supports only 1:1 translation.

"Céline hören 10€".reduce( :preserve => "ÄÖÜäöüß€", :fast => true )
=> "Celine hören 10€"   

"ÄÖÜäöüß€".reduce( :fast => true ) 
=> "AOUaous"

Your result will only contain these characters:

printable letters and basic symbols of the 7bit ASCII charset (\x20..\x7e)
preserved characters as defined in the options (max 18)
newlines (\x0a and \x0d)

Options:

:preserve: Special characters to preserve. You can only preserve up to 18 characters.
:fast: Fast mode, if true. About 10 times faster, but it supports only 1:1 translation.

Returns:

(String)

Raises:

(ArgumentError)

# File 'lib/kyanite/string/chars.rb', line 91

def reduce( options ={} )
  preserve = options[:preserve] || ''
  raise ArgumentError, 'max preserve string length is 18 chars'   if preserve.length > 18 
  
  result = self.delete("\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0b\x0c\x0e-\x1f") 
  result.tr!(preserve, "\x0e-\x1f")                               if preserve.length > 0
  
  result = result.to_ascii_extra_chars                            unless options[:fast]
  result.tr!(TR_FULL, TR_REDUCED)     
  result = UnicodeUtils.nfkd(result)                              unless options[:fast]
  
  result.delete!("^\x09-\x7e")          
  result.tr!("\x0e-\x1f", preserve)                               if preserve.length > 0    
  result
end
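
The NFKD step can be tried in isolation. The sketch below swaps in Ruby's built-in String#unicode_normalize for the UnicodeUtils gem and covers only the decomposition and the ASCII filter, so it strips accents but does not perform the human-like transliteration of reduce (for example ö to oe). `ascii_fold_sketch` is a hypothetical name.

```ruby
# Decomposes compatibility characters (ligatures, superscripts, accented letters),
# then keeps only printable 7-bit ASCII plus newlines.
def ascii_fold_sketch(text)
  decomposed = text.unicode_normalize(:nfkd)
  decomposed.delete("^\x20-\x7e\n\r")
end

puts ascii_fold_sketch("\uFB03\u2168\u00B2\u00B3\u2075")   # ffiIX235
puts ascii_fold_sketch("C\u00E9line")                      # Celine
```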

- (`String`) reduce53(options = {})

Reduces the string to a base53 encoding. The result consists only of uppercase letters, hyphens, and lowercase letters that stand in for some known special characters.

Removes all non-letter characters.
Converts all regular letters to uppercase.
Converts special letters to reduced lowercase letters, e.g. àáâăäãāåạąæảấầắằÀÁÂĂÄÃĀÅẠĄÆẢẤẦẮẰ etc. to aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.
Caution: newlines are also removed.

See tests and examples here.

Returns:

(String)



# File 'lib/kyanite/string/chars.rb', line 135

def reduce53( options={} )
  dup.reduce53!(options)
end

- (`String`) reduce53!(options = {})

In-place-variant of reduce53.

Returns:

(String)

# File 'lib/kyanite/string/chars.rb', line 142

def reduce53!( options={} )

  if options[:camelcase]  
    self.gsub!(/([A-Z]+)([A-Z][a-z])/,'\1-\2')
    self.gsub!(/([a-z\d])([A-Z])/,'\1-\2')
  end
  
  self.gsub!( 'ß', options[:german_sz] )        if options[:german_sz]        
  self.tr!('abcdefghijklmnopqrstuvwxyz§', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ ') 
  
  self.tr!(TR_FULL, TR_REDUCED.downcase)  
  unless options[:space]
    self.delete!('^- A-Za-z')
  else
    self.tr!('^- A-Za-z', ' ')      
  end
  self.gsub!(/-+/,  ' ')          
  self.gsub!(/\s+/, ' ')      
  self.strip!   
  self.gsub!(/ /,   '-')           
  self
end

- (`String`) reduce94(options = {})

Deprecated.

Returns:

(String)



# File 'lib/kyanite/string/chars.rb', line 111

def reduce94( options={} )
  reduce(  {:fast => true}.merge(options)  )
end

- (`String`) shuffle(separator = //)

Returns the characters of the string in random order. Example:

"Random order".shuffle
=> "oeo rdRdnmar"

Returns:

(String)



# File 'lib/kyanite/string/random.rb', line 97

def shuffle(separator=//)
  split(separator).shuffle.join('')
end
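
The separator decides what counts as one unit to shuffle. A minimal standalone sketch of the same split, shuffle and join steps follows; `shuffle_sketch` is a hypothetical name. Because the pieces are joined with an empty string, a non-empty separator is dropped from the result.

```ruby
# Splits by the separator, shuffles the pieces, and joins them without a separator.
def shuffle_sketch(text, separator = //)
  text.split(separator).shuffle.join('')
end

puts shuffle_sketch('Random order')        # characters in random order
puts shuffle_sketch('one two three', ' ')  # whole words in random order, spaces removed
```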

- (`String`) shuffle!(separator = //)

In-place-variant of shuffle.

Returns:

(String)



# File 'lib/kyanite/string/random.rb', line 103

def shuffle!(separator=//)
  self.replace( shuffle(separator) )
end

- (`Array`) split_by_index(idx)

Cuts a string in parts with given length. See tests and examples here.

Parameters:

idx (Integer, Array of Integer) —

Length of parts

Returns:

(Array) —

All the parts with given length, plus remainder.

# File 'lib/kyanite/string/split.rb', line 85

def split_by_index(idx)     
  if idx.kind_of?(Integer)
   [nchar(idx)] + [nchar(idx,'')]
    
  elsif idx.kind_of?(Array)
   [nchar(idx[0])] + nchar(idx[0],'').split_by_index(idx.shift_complement)           
    
  end # if
end
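
Since no example is given here, the following standalone sketch illustrates the description "parts with given length, plus remainder". It does not use nchar or shift_complement, so its behaviour in edge cases is an assumption and may differ from the library; `split_by_lengths_sketch` is a hypothetical name.

```ruby
# Cuts off one part per requested length and appends whatever is left.
def split_by_lengths_sketch(text, lengths)
  parts = []
  rest = text
  Array(lengths).each do |len|
    parts << rest[0, len].to_s
    rest = rest[len..-1].to_s
  end
  parts << rest
end

p split_by_lengths_sketch('abcdefgh', 3)        # ["abc", "defgh"]
p split_by_lengths_sketch('abcdefgh', [2, 3])   # ["ab", "cde", "fgh"]
```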

- (`Array`) split_numeric

Separates a string into numeric and alphanumeric parts. Currently works only with positive integers. Example:

'abc123'.split_numeric  >>  ['abc',123]   (Array)
'123abc'.split_numeric  >>  [123,'abc']   (Array)
'123'.split_numeric     >>  123           (Integer)
'abc'.split_numeric     >>  'abc'         (String)

It even works with more than two parts:

'123abc456'.split_numeric  >>  [123,'abc',456]
'abc123def'.split_numeric  >>  ['abc',123,'def']

See tests and examples here.

Returns:

(Array) —

alphanumeric and numeric part

# File 'lib/kyanite/string/split.rb', line 126

def split_numeric
    result = shatter(/\d+/).collect{ |i| i.to_integer_optional }
    return result[0]    if ( result.is_collection? && result.size == 1 )
    return result
end

- (`String`) sql_regexp_for_kommaliste

Returns an SQL regular expression for searching a comma-separated list in Postgres.

Returns:

(String)

# File 'lib/kyanite/string/list.rb', line 73

def sql_regexp_for_kommaliste
   '[, ]'  + self  + '[, ]'          + '|' +       # match in the middle
   '^'     + self  + '[, ]'          + '|' +       # match at the beginning
   '[, ]'  + self  + '$'             + '|' +       # match at the end
   '^'     + self  + '$'                           # match the whole string
end
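
The four alternatives can be tried out with Ruby's own regexp engine. This sketch assumes the pattern behaves the same way for these simple single-line inputs as it would in the database; `comma_list_pattern_sketch` is a hypothetical name, and the item is inserted without escaping, as in the original.

```ruby
# Builds the same four alternatives: middle, beginning, end, whole string.
def comma_list_pattern_sketch(item)
  "[, ]#{item}[, ]|^#{item}[, ]|[, ]#{item}$|^#{item}$"
end

pattern = Regexp.new(comma_list_pattern_sketch('blue'))

p !!(pattern =~ 'red,blue,green')   # true  (match in the middle)
p !!(pattern =~ 'blue')             # true  (whole string)
p !!(pattern =~ 'red,lightblue')    # false (only part of an entry)
p !!(pattern =~ 'blueish,red')      # false (only part of an entry)
```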

- (`Array`) to_a

reverse of Array#to_s_utf8

Returns:

(Array)

# File 'lib/kyanite/string/chars.rb', line 27

def to_a
  result = []
  self.each_char do |c|
    result << c
  end
  result
end

- (`Array`) to_array_of_codepoints

reverse of Array#to_s_utf8

Returns:

(Array)



# File 'lib/kyanite/string/chars.rb', line 38

def to_array_of_codepoints
  self.codepoints.to_a
end

- (`Array`) to_array_of_hex

Returns the Unicode code points of the string as an array of hexadecimal strings.

Returns:

(Array)



# File 'lib/kyanite/string/chars.rb', line 43

def to_array_of_hex
  self.unpack('U'*self.length).collect {|x| x.to_s 16}
end

- (`Class`) to_class

Converts to a class, the reverse of to_classname

Defined for classes Class, Symbol, String. Accepts both CamelCase and down_case.

Tests and examples here.

Returns:

(Class)

# File 'lib/kyanite/general/classutils.rb', line 82

def to_class
    self.camelize.constantize
rescue
    return nil
end

- (`String`) to_classname

Converts to a class name, the reverse of to_class.

Defined for classes Class, Symbol, String. The class name will contain only lowercase letters.

'MyModul::MyClass'  =>  'my_class'

Tests and examples here.

Returns:

(String)



# File 'lib/kyanite/general/classutils.rb', line 70

def to_classname
    self.demodulize.underscore
end

- (`Integer`) to_identifier

Converts a string into the most plausible Identifier

See examples and tests here.

Returns:

(Integer)



# File 'lib/kyanite/string/cast.rb', line 22

def to_identifier
  self.strip.to_integer_optional
end

- (`Integer`) to_integer

Converts a string to an integer, even if the number was appended to anything. Unlike to_i it returns nil if no integer was found inside the string.

See examples and tests here.

Returns:

(Integer)

# File 'lib/kyanite/string/cast.rb', line 33

def to_integer
    return nil                 unless self =~ /\d/
    firsttry = self.to_i
    return firsttry         if firsttry != 0
    return self.scan(/\d+/)[0].to_i 
end
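
The three branches are easiest to see with concrete inputs. The following sketch copies the logic into a hypothetical standalone function, `to_integer_sketch`, so it can be run without patching String.

```ruby
# Hypothetical standalone copy of the to_integer logic shown above.
def to_integer_sketch(str)
  return nil unless str =~ /\d/         # no digit anywhere: nil
  first_try = str.to_i
  return first_try if first_try != 0    # string starts with a number
  str.scan(/\d+/)[0].to_i               # otherwise take the first run of digits
end

p to_integer_sketch('123abc')  # => 123
p to_integer_sketch('abc123')  # => 123
p to_integer_sketch('-5')      # => -5
p to_integer_sketch('abc-5')   # => 5
p to_integer_sketch('abc')     # => nil
```

The fallback scan only captures digits, so a minus sign in front of an embedded number is dropped, as the `'abc-5'` case shows.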

- (`Integer`, `String`) to_integer_optional

Tries to convert a string to an integer. Returns self if the string does not start with a number. Empty strings are converted to nil.

See examples and tests here.

Returns:

(Integer, String)

# File 'lib/kyanite/string/cast.rb', line 48

def to_integer_optional
    return nil                  if self.empty?
    return self                 unless (self =~ /^\d/  || self =~ /^-\d/ )
    return self.to_i
end
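
As a rough illustration of the three possible results (nil, the unchanged string, or an integer), here is the same logic as a hypothetical standalone function, `to_integer_optional_sketch`.

```ruby
# Hypothetical standalone copy of the to_integer_optional logic shown above.
def to_integer_optional_sketch(str)
  return nil if str.empty?
  return str unless str =~ /^\d/ || str =~ /^-\d/  # must start with a digit or "-digit"
  str.to_i
end

p to_integer_optional_sketch('')           # => nil
p to_integer_optional_sketch('42')         # => 42
p to_integer_optional_sketch('-7 apples')  # => -7
p to_integer_optional_sketch('abc')        # => "abc"
p to_integer_optional_sketch(' 42')        # => " 42"
```

A leading space is enough to make the string come back unchanged, which is presumably why to_identifier calls strip first.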

- (`String`, `Nil`) to_nil

Non-empty strings are returned. Empty strings are converted to nil.

Returns:

(String, Nil)

# File 'lib/kyanite/string/cast.rb', line 58

def to_nil
  return self unless self.empty?
  nil
end

- (`String`) to_x

Get a hex representation for a char. See also from_x.

Returns:

(String)

# File 'lib/kyanite/string/cast.rb', line 75

def to_x
  hex = ''
  each_byte { |byte|
    # To get a hex representation for a char we just utilize
    # the quotient and the remainder of division by base 16.
    q, r = byte.divmod(16)
    hex << HEX_CHARS[q] << HEX_CHARS[r]
  }
  hex
end
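
The divmod step can be checked against Ruby's built-in hex unpacking. This is a sketch under one assumption: the definition of HEX_CHARS is not shown here, so the example uses its own constant, `HEX_DIGITS`, with lowercase digits. `to_hex_sketch` is likewise a hypothetical name.

```ruby
HEX_DIGITS = '0123456789abcdef'  # assumed digit table; Kyanite's HEX_CHARS may differ

# Hypothetical standalone version of the to_x loop.
def to_hex_sketch(str)
  hex = +''                       # unary plus gives an unfrozen, mutable string
  str.each_byte do |byte|
    q, r = byte.divmod(16)        # high nibble and low nibble
    hex << HEX_DIGITS[q] << HEX_DIGITS[r]
  end
  hex
end

p to_hex_sketch('A')                         # => "41"
p to_hex_sketch('Hi')                        # => "4869"
p to_hex_sketch('Hi') == 'Hi'.unpack1('H*')  # => true
```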

- (`String`) underscore

The reverse of camelize. Makes an underscored, lowercase form from the expression in the string. Changes '::' to '/' to convert namespaces to paths.

Examples:

"ActiveRecord".underscore         # => "active_record"
"ActiveRecord::Errors".underscore # => "active_record/errors"

Returns:

(String)

# File 'lib/kyanite/general/classutils.rb', line 100

def underscore
  self.gsub(/::/, '/').
    gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
    gsub(/([a-z\d])([A-Z])/,'\1_\2').
    tr("-", "_").
    downcase
end
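
To see what each substitution contributes, the chain can be run as a plain function. `underscore_sketch` is a hypothetical name for a standalone copy of the method body; the acronym and hyphen inputs are extra illustrations beyond the examples given above.

```ruby
# Hypothetical standalone copy of the underscore chain.
def underscore_sketch(str)
  str.gsub(/::/, '/')                           # namespaces become path separators
     .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')     # split an acronym from the next word
     .gsub(/([a-z\d])([A-Z])/, '\1_\2')         # split lowercase/digit from uppercase
     .tr('-', '_')                              # hyphens become underscores
     .downcase
end

p underscore_sketch('ActiveRecord')          # => "active_record"
p underscore_sketch('ActiveRecord::Errors')  # => "active_record/errors"
p underscore_sketch('HTTPServer')            # => "http_server"
p underscore_sketch('my-gem-name')           # => "my_gem_name"
```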

- (`String`) upcase2

Better upcase: also works with special letters like German umlauts. (Overriding upcase itself leads to strange results when Active Support is in use.)

See tests and examples here.

Returns:

(String)



# File 'lib/kyanite/string/chars.rb', line 228

def upcase2
  self.tr(TR_DOWNCASE, TR_UPCASE).upcase
end

- (`String`) upcase2!

In-place-variant of upcase2.

Returns:

(String)



# File 'lib/kyanite/string/chars.rb', line 234

def upcase2!
  # tr! and upcase! return nil when nothing changes, so they are not chained
  self.tr!(TR_DOWNCASE, TR_UPCASE)
  self.upcase!
  self
end

- (`Boolean`) upcase?

Is the string all uppercase? Also works with special letters like German umlauts.

Returns:

(Boolean)



# File 'lib/kyanite/string/chars.rb', line 251

def upcase?
  (self == self.upcase2)
end

- (`String`) without_versioninfo

Removes numeric parts and trailing whitespace, hyphens, underscores, and periods. See tests and examples here.

Returns:

(String): Substring



# File 'lib/kyanite/string/split.rb', line 137

def without_versioninfo
    shatter(/\d+/)[0].strip.chomp('_').chomp('-').chomp('.')
end
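
The trimming chain can be explored without Facets. In the sketch below, `without_versioninfo_sketch` is a hypothetical helper, and `partition(/\d+/)[0]` stands in for `shatter(/\d+/)[0]`; the two are only assumed to agree when the string does not start with digits.

```ruby
# Hypothetical helper: keep the text before the first run of digits,
# then trim whitespace and one trailing "_", "-", and "." (in that order).
def without_versioninfo_sketch(str)
  str.partition(/\d+/)[0].strip.chomp('_').chomp('-').chomp('.')
end

p without_versioninfo_sketch('myapp-1.2.3')  # => "myapp"
p without_versioninfo_sketch('report 2024')  # => "report"
p without_versioninfo_sketch('tool._3')      # => "tool"
p without_versioninfo_sketch('tool_.3')      # => "tool_"
```

Each chomp runs once and in a fixed order, so a separator hidden behind a later one survives, as in the last case.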
