Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler/pathutils] AbsoluteFile,AbsoluteDir,RelativeFile,RelativeDir is too complicated #71

Closed
timotheecour opened this issue Oct 25, 2018 · 3 comments · Fixed by nim-lang/Nim#20582

Comments

@timotheecour
Copy link
Member

timotheecour commented Oct 25, 2018

now that there's a talk of starting to integrate compiler/pathutils inside ospaths to allow reusing its relativeTo in ospaths (as suggested here: nim-lang/Nim#8166 (comment)) it's time to look into design decisions more carefully.
I actually had a WIP PR that moved things around to allow for that (see branch https://github.com/timotheecour/Nim/tree/pr_ospaths_pathutils [it's WIP]) but I don't like the resulting API's.

let me denote by strongly typed paths the types introduced in compiler/pathutils:

type
  AbsoluteFile* = distinct string
  AbsoluteDir* = distinct string
  RelativeFile* = distinct string
  RelativeDir* = distinct string

Let's contrast this with following alternative, dubbed simply typed paths:

type
  Path* = distinct string

IMO, distinguishing (what's more, at compile time) between these 4 variants doesn't pull its weight, and results in a lot of procs being duplicated. It's already the case in pathutils, and, if we apply same logic to rest of procs in ospaths, the problem is only gonna get worse.

problem 1: dealing with paths of unknown type

  • suppose an application deals with some paths of unknown origin (paths could be either absolute or relative, files or dirs) where we don't care which category these belong to; there's no simple way to express these in terms of the strongly typed paths, nor to use strongly typed API's ; this forces user code to be generic, which is "contagious", what would've started as non-generic code has to become generic, adding complexity

  • another example: isAbsolute itself can't be written using strongly typed paths; it has to be written as isAbsolute(string), losing the benefit of type safety + normalization we get by simply typed paths

problem 2: very little benefit from distinguishing between those

eg:

  • lots of cases where the path type makes 0 difference:
  proc `==`*(x, y: AbsoluteFile): bool = eqImpl(x.string, y.string)
  proc `==`*(x, y: AbsoluteDir): bool = eqImpl(x.string, y.string)
  proc `==`*(x, y: RelativeFile): bool = eqImpl(x.string, y.string)
  proc `==`*(x, y: RelativeDir): bool = eqImpl(x.string, y.string)

proc isEmpty*(x: AbsoluteFile): bool {.inline.} = x.string.len == 0
proc isEmpty*(x: AbsoluteDir): bool {.inline.} = x.string.len == 0
proc isEmpty*(x: RelativeFile): bool {.inline.} = x.string.len == 0
proc isEmpty*(x: RelativeDir): bool {.inline.} = x.string.len == 0

likewise with unixToNativePath, and many other procs in ospaths

  • lots of cases where the path type makes little difference:
    eg: joinPaths(path1, path2) should work with:
    (AbsoluteDir | RelativeDir, AbsoluteFile | AbsoluteDir | RelativeFile | RelativeDir)
    so we end up with either generics or with very redundant proc overloads; the only advantage of type safety we get here is preventing 1st arg path1 to be a File; this doesn't seem like a good tradeoff

problem 3: not clear how to represent / provide API's for symlinks, filenames etc

is a filename (aka basename) just a string? or yet another distinct string? or a RelativeFile ? if it's just a string (the most likely option), we lose the benefits of type safety from hereafter for the resulting basename, eg:

let file = RelativeFile("foo1/foo2/bar.txt")
let filename = extractFileName(file)
let dir: RelativeDir = dirName(dirName(file))
let file2 = dir / filename # error: 2nd arg should be a `AbsoluteFile | AbsoluteDir | RelativeFile | RelativeDir`; not a `string`

[EDIT] problem 4: the type safety guarantees are not actual guarantees, and often violated in practice

the guarantees offered by pathutils.AbsoluteFile + friends are rather moot because code like AbsoluteFile(file) makes 0 checking (nor can it since it's an object constructor).
In practice these "guarantees" are often violated, eg see:

alternative A1 (simplest)

  • just use a single type:
type
  Path* = distinct string

Path could represent a normalized path, it'd be normalized during construction, eg:

let a=Path"/abc//def/" # calls `normalize` (aka canon)
doAssert a.string == "/abc/def"

# partial type safety
proc write(path: Path, contents: string) # writeFile is implicit; and no possible confusion between path and contents
# still need to distinguish these:
proc removeFile(file: Path) 
proc removeDir(dir: Path) 

alternative A2 (typesafe and flexible)

use a class hierarchy

type
  Path* = object of RootObj
    s: string
  PathFile* = object of Path
  PathDir* = object of Path

template p(s: string): Path = Path(s: s.normalize)
template `$`(path: Path): string = path.s
let path = p"abc//def/"
echo path # "abc/def

# full type safety:
proc write(path: PathFile, contents: string)
proc remove(file: PathFile) 
proc remove(dir: PathDir) 

Note: I'm not distinguishing between absolute vs relative here; IMO it doesn't pull its weight.
API's shall use either Path, PathFile, or PathDir as appropriate in proc definitions.

We still get benefits of type safety at compile time where we need to distinguish, as well ability to use paths of unknown kind (a path, which can be a file or dir)

One thing to consider is whether this leads to runtime performance issues (due to dynamic dispatch) of practical impact. If it doesn't, this may be the best option.

@Araq
Copy link
Member

Araq commented Oct 25, 2018

type
  Path* = distinct string

is not good enough for the Nim compiler and so it's not typesafe enough for other code bases too. In particular, filenames and directories are completely different beasts.

problem 1: dealing with paths of unknown type

That's not a problem, that's a feature, you need to validate/check your inputs.

lots of cases where the path type makes 0 difference:

Wrong, I cannot even compare filenames with paths, that's a feature too.

problem 3: not clear how to represent / provide API's for symlinks, filenames etc

A filename is a RelativeFile, a symlink has no syntax so depending on what it points to, it's one of the 4 distinct types.

@krux02
Copy link
Contributor

krux02 commented Oct 30, 2018

  • It is an internal API
  • If it is exposed at some point in the standard library, you don't need to use it. Write your own pathutils if this one does not suit your preferences.
  • or write converter functions locally in your project to automatically convert between all path types.

@krux02 krux02 closed this as completed Oct 30, 2018
@Araq
Copy link
Member

Araq commented Oct 30, 2018

It is a discussion of how to design a better ospaths, so let's please keep it open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants