
Halp with regex

Name: Anonymous 2008-08-04 3:54

Hey /prog/,
Let's say I have a file with many lines;
each line consists of a very, very long random number (string).
Is it possible to write a regular expression to find the longest series of repeated numbers shared by all lines?
e.g.
120123456789948761238746102347106416239847113958719385
217490812741234567890182472103571938547019112413956138756183756183756817356817356
192346112345678978932671658917612746183724681734682173649187245

(I bolded 123456789 for readability)

tldr;
how does I finded biggest repeaded series of numbers contained in all lines? (using regex or something one-line'ish)
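
For reference, here is a minimal brute-force sketch of what is being asked, assuming "series shared by all lines" means the longest contiguous substring that occurs in every line. The names substrings and longestCommon are made up for illustration; nobody in the thread posted this. It tries every substring of the shortest line, longest first, and keeps the first one found in all lines.

```haskell
import Data.List (inits, isInfixOf, minimumBy, sortBy, tails)
import Data.Ord (Down (..), comparing)

-- All non-empty contiguous substrings of a string.
substrings :: String -> [String]
substrings s = [sub | t <- tails s, sub <- tail (inits t)]

-- Longest substring contained in every line, or Nothing if the
-- lines share no character at all (or the input is empty).
-- Assumption: "shared series" means a common contiguous substring.
longestCommon :: [String] -> Maybe String
longestCommon [] = Nothing
longestCommon ls =
  case filter inAll candidates of
    (best : _) -> Just best
    []         -> Nothing
  where
    -- Any common substring must fit inside the shortest line.
    shortest   = minimumBy (comparing length) ls
    -- Longest candidates first; the sort is stable, so ties keep
    -- their left-to-right order from the shortest line.
    candidates = sortBy (comparing (Down . length)) (substrings shortest)
    inAll sub  = all (sub `isInfixOf`) ls
```

This is far too slow for really long lines (it checks on the order of n^2 candidates against every line), but it pins down what the answer should be, which helps when checking a faster version.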

Name: Anonymous 2008-08-06 14:12

>>38
Hello my name is >>15 and I think it has to be [n,n+1,...,m], like "123456789", as bolded in >>1

Name: Anonymous 2008-08-06 14:12

I mean my name is >>30, not >>15

Name: Anonymous 2008-08-06 15:14

Have fun doing it for n Strings.

Name: Anonymous 2008-08-06 20:25

>>44
I will, along with my anus.

Name: Anonymous 2008-08-06 21:30

Here's the new and improved Haskell code. It only finds substrings longer than 2 characters, but it's orders of magnitude faster than >>15.

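-- Control.Monad.Instances is left out: the Functor/Monad instances for
-- functions that it used to provide come with the Prelude on current GHC.
import Control.Monad
import Control.Arrow
import Data.Array
import Data.List
import Data.Ord
import System.Environment (getArgs)  -- was: import System (Haskell 98 module)
import System.Random (randomRIO)     -- was: import Random (now in the "random" package)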

--benchMain =
main =
   do [lines, lineLength] <- fmap (map read) getArgs
      putStrLn . longestSubNumber
         =<< (replicateM lines . replicateM lineLength $ randomRIO ('0','9'))

realMain =
--main =
   putStrLn . longestSubNumber . lines =<< getContents

longestSubNumber [firstNum, secondNum] =
   maximumBy (comparing length) $ commonSubNumbers firstNum secondNum

longestSubNumber numbers =
   maximumBy (comparing length)
      $ uncurry (foldr (concatMap . commonSubNumbers))
      $ first (liftM2 commonSubNumbers head last) $ splitAt 2 numbers

commonSubNumbers firstNum secondNum =
   nubBy (flip isInfixOf)
      $ map (map (head numbers !!) . map head) $ reverse
      $ sortByArrayOn length bounds $ filter ((>2) . length)
      $ groupSequences
      $ sortByArrayOn ((.) head . zipWith (-) =<< tail) bounds
      $ sortByArrayOn head bounds
      $ concatMap sequence $ filter (all (not . null)) $ transpose
      $ map (elems . accumArray (flip (:)) [] ('0','9') . flip zip [0..]) numbers
   where
      bounds = ((,) =<< negate) $ maximum $ map length numbers
      numbers = [firstNum, secondNum]

groupSequences matchList =
   snd $ mapAccumL ((.) (uncurry $ flip (,)) . flip splitAt) matchList
       $ liftM2 (:) ((+1) . head) (zipWith (-) =<< tail)
       $ findIndices (uncurry $ (/=) . map (subtract 1))
       $ (zip =<< tail) $ matchList ++ [[]]

sortByArrayOn rankElem bounds list =
   concat $ map reverse $ filter (not . null) $ elems
          $ accumArray (flip (:)) [] bounds 
          $ map ((,) =<< rankElem) list
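
The two accumArray tricks in the code above are easier to follow in isolation. This is a small sketch, not a rewrite of the post: digitPositions and bucketSortOn are hypothetical names, and the sketch assumes the input holds only the characters '0' to '9' (anything else makes accumArray fail with an out-of-bounds error). The first function buckets the positions of each digit, as the innermost map in commonSubNumbers does; the second is a stable bucket sort on small integer keys, in the spirit of sortByArrayOn.

```haskell
import Data.Array (Array, accumArray, elems, (!))

-- For each digit '0'..'9', the positions at which it occurs, ascending.
-- accumArray with (flip (:)) prepends, so each bucket is reversed at the end.
digitPositions :: String -> Array Char [Int]
digitPositions s =
  fmap reverse (accumArray (flip (:)) [] ('0', '9') (zip s [0 ..]))

-- Stable bucket sort: every key must lie within the given bounds.
-- Elements are dropped into the bucket for their key, then the buckets
-- are read back in key order.
bucketSortOn :: (Int, Int) -> (a -> Int) -> [a] -> [a]
bucketSortOn bnds key xs =
  concatMap reverse (elems (accumArray (flip (:)) [] bnds [(key x, x) | x <- xs]))
```

For example, digitPositions "12131" ! '1' gives [0,2,4]. Sorting this way avoids comparisons entirely, which is presumably part of why this version beats the earlier one, though the post itself does not say so.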
