Closed
Opened May 21, 2010 by Pete (@trac-Pete)

Data.List 'nub' function is O(n^2)

I recently discovered that some Haskell code was running much slower than I expected. I eventually traced the problem to the 'nub' function in Data.List, which GHC implements as follows:

nub l                   = nub' l []
  where
    nub' [] _           = []
    nub' (x:xs) ls
        | x `elem` ls   = nub' xs ls
        | otherwise     = x : nub' xs (x:ls)

This appears to be O(n^2), because it accumulates the values it has already seen in a list and searches that list with `elem` for every element. If it used a different data structure, such as a Set, it could be made O(n log n).
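To make the Set suggestion concrete, here is a minimal sketch of what such a variant might look like. The name nubOrdSketch is hypothetical, and the sketch assumes the containers package is available. Note that it needs an Ord constraint, whereas Data.List's nub only requires Eq, so it could not be a drop-in replacement for every use of nub.

```haskell
import qualified Data.Set as Set

-- Hypothetical Set-based variant of nub (illustration only, not GHC's code).
-- Keeps the first occurrence of each element and preserves input order.
-- Requires Ord, unlike Data.List.nub, which only requires Eq.
nubOrdSketch :: Ord a => [a] -> [a]
nubOrdSketch = go Set.empty
  where
    -- 'seen' holds every element emitted so far.
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs                     -- O(log n) lookup
      | otherwise           = x : go (Set.insert x seen) xs  -- O(log n) insert
```

Each element costs one Set lookup and at most one insert, so the whole traversal is O(n log n) rather than O(n^2). The containers package appears to offer a function along these lines, nubOrd in Data.Containers.ListUtils, which may be preferable to a hand-rolled version.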

I'm not sure whether this should be considered a bug or not. The list nub returns is correct. On the other hand, when I call a library function in any programming language, I would normally expect it to use an algorithm that provides the best achievable asymptotic performance.

If you decide that you don't consider this to be a bug, can I suggest adding a note to the documentation, so people are aware that nub should only be used with short lists?

Edited Mar 09, 2019 by Ian Lynagh <igloo@earth.li>
Reference: ghc/ghc#4086