Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* refactor * refactor * update functions * add notes
Showing 8 changed files with 2,184 additions and 41 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,48 +1,16 @@ | ||
## List of functions | ||
# Public Documentation | ||
|
||
Every `Content` subclass has the following built-in functions: | ||
Documentation for `AwkwardArray.jl`'s public interface. | ||
|
||
* `Base.length` | ||
* `Base.size` (1-tuple of `length`) | ||
* `Base.firstindex`, `Base.lastindex` (1-based or inherited from its index) | ||
* `Base.getindex`: select by `Int` (single item), `UnitRange{Int}` (slice), and `Symbol` (record field) | ||
* `Base.iterate` | ||
* `Base.(==)` (equality defined by values: a `ListOffsetArray` and a `ListArray` may be considered the same) | ||
* `Base.push!` | ||
* `Base.append!` | ||
* `Base.show` | ||
See the Internals section of the manual for internal package docs covering all submodules. | ||
|
||
They also have the following functions for manipulating and checking structure: | ||
## Index | ||
|
||
* `AwkwardArray.parameters_of`: gets all parameters | ||
* `AwkwardArray.has_parameter`: returns true if a parameter exists | ||
* `AwkwardArray.get_parameter`: returns a parameter or raises an error | ||
* `AwkwardArray.with_parameter`: returns a copy of this node with a specified parameter | ||
* `AwkwardArray.copy`: shallow-copy of the array, allowing properties to be replaced | ||
* `AwkwardArray.is_valid`: verifies that the structure adheres to Awkward Array's protocol | ||
```@index | ||
Pages = ["api.md"] | ||
``` | ||
|
||
They have the following functions for filling an array: | ||
## Public Interface | ||
|
||
* `AwkwardArray.end_list!`: closes off a `ListType` array (`ListOffsetArray`, `ListArray`, or `RegularArray`) in the manner of Python's [ak.ArrayBuilder](https://awkward-array.org/doc/main/reference/generated/ak.ArrayBuilder.html) (no `begin_list` is necessary) | ||
* `AwkwardArray.end_record!`: closes off a `RecordArray` | ||
* `AwkwardArray.end_tuple!`: closes off a `TupleArray` | ||
* `AwkwardArray.push_null!`: pushes a missing value onto `OptionType` arrays (`IndexedOptionArray`, `ByteMaskedArray`, `BitMaskedArray`, or `UnmaskedArray`) | ||
* `AwkwardArray.push_dummy!`: pushes an unspecified value onto the array (used by `ByteMaskedArray` and `BitMaskedArray`, which need to have a placeholder in memory behind each `missing` value) | ||
|
||
`RecordArray` and `TupleArray` have the following for selecting fields (as opposed to rows): | ||
|
||
* `AwkwardArray.slot`: gets a `RecordArray` or `TupleArray` field, to avoid conflicts with `Base.getindex` for `TupleArrays` (both use integers to select a field) | ||
* `AwkwardArray.Record`: scalar representation of an item from a `RecordArray` | ||
* `AwkwardArray.Tuple`: scalar representation of an item from a `TupleArray` (note: not the same as `Base.Tuple`) | ||
|
||
`UnionArray` has the following for dealing with specializations: | ||
|
||
* `AwkwardArray.Specialization`: selects a `UnionArray` specialization for `push!`, `append!`, etc. | ||
|
||
Finally, all `Content` subclasses can be converted with the following: | ||
|
||
* `AwkwardArray.layout_for`: returns an appropriately-nested `Content` type for a given Julia type (`DataType`) | ||
* `AwkwardArray.from_iter`: converts Julia data into an Awkward Array | ||
* `AwkwardArray.to_vector`: converts an Awkward Array into Julia data | ||
* `AwkwardArray.from_buffers`: constructs an Awkward Array from a Form (JSON), length, and buffers for zero-copy passing from Python | ||
* `AwkwardArray.to_buffers`: deconstructs an Awkward Array into a Form (JSON), length, and buffers for zero-copy passing to Python |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
```@meta | ||
CurrentModule = AwkwardArray | ||
``` | ||
## List of [`Content`](@ref) functions | ||
|
||
Every [`Content`](@ref) subclass has the following built-in functions: | ||
|
||
* [`Base.length`](@ref) | ||
* [`Base.size`](@ref) (1-tuple of `length`) | ||
* [`Base.firstindex`](@ref), [`Base.lastindex`](@ref) (1-based or inherited from its index) | ||
* [`Base.getindex`](@ref) select by `Int` (single item), `UnitRange{Int}` (slice), and `Symbol` (record field) | ||
* [`Base.iterate`](@ref) | ||
* [`Base.:(==)`](@ref) (equality defined by values: a [`ListOffsetArray`](@ref) and a [`ListArray`](@ref) may be considered the same) | ||
* [`Base.push!`](@ref) | ||
* [`Base.append!`](@ref) | ||
* [`Base.show`](@ref) | ||
|
||
They also have the following functions for manipulating and checking structure: | ||
|
||
* [`AwkwardArray.parameters_of`](@ref) gets all parameters | ||
* [`AwkwardArray.has_parameter`](@ref) returns true if a parameter exists | ||
* [`AwkwardArray.get_parameter`](@ref) returns a parameter or raises an error | ||
* [`AwkwardArray.with_parameter`](@ref) returns a copy of this node with a specified parameter | ||
* [`AwkwardArray.copy`](@ref) shallow-copy of the array, allowing properties to be replaced | ||
* [`AwkwardArray.is_valid`](@ref) verifies that the structure adheres to Awkward Array's protocol | ||
|
||
They have the following functions for filling an array: | ||
|
||
* [`AwkwardArray.end_list!`](@ref): closes off a [`ListType`](@ref) array ([`ListOffsetArray`](@ref), [`ListArray`](@ref), or [`RegularArray`](@ref)) in the manner of Python's [ak.ArrayBuilder](https://awkward-array.org/doc/main/reference/generated/ak.ArrayBuilder.html) (no `begin_list` is necessary) | ||
* [`AwkwardArray.end_record!`](@ref) closes off a [`RecordArray`](@ref) | ||
* [`AwkwardArray.end_tuple!`](@ref) closes off a [`TupleArray`](@ref) | ||
* [`AwkwardArray.push_null!`](@ref) pushes a missing value onto [`OptionType`](@ref) arrays ([`IndexedOptionArray`](@ref), [`ByteMaskedArray`](@ref), [`BitMaskedArray`](@ref), or [`UnmaskedArray`](@ref)) | ||
* [`AwkwardArray.push_dummy!`](@ref) pushes an unspecified value onto the array (used by [`ByteMaskedArray`](@ref) and [`BitMaskedArray`](@ref), which need to have a placeholder in memory behind each `missing` value) | ||
|
||
[`RecordArray`](@ref) and [`TupleArray`](@ref) have the following for selecting fields (as opposed to rows): | ||
|
||
* [`AwkwardArray.slot`](@ref) gets a [`RecordArray`](@ref) or [`TupleArray`](@ref) field, to avoid conflicts with [`Base.getindex`](@ref) for `TupleArrays` (both use integers to select a field) | ||
* [`AwkwardArray.Record`](@ref) scalar representation of an item from a [`RecordArray`](@ref) | ||
* [`AwkwardArray.SlotRecord`](@ref) scalar representation of an item from a [`TupleArray`](@ref) (note: not the same as `Base.Tuple`) | ||
|
||
[`UnionArray`](@ref) has the following for dealing with specializations: | ||
|
||
* [`AwkwardArray.Specialization`](@ref) selects a [`UnionArray`](@ref) specialization for [`push!`](@ref), [`append!`](@ref), etc. | ||
|
||
Finally, all [`Content`](@ref) subclasses can be converted with the following: | ||
|
||
* [`AwkwardArray.layout_for`](@ref) returns an appropriately-nested [`Content`](@ref) type for a given Julia type (`DataType`) | ||
* [`AwkwardArray.from_iter`](@ref) converts Julia data into an Awkward Array | ||
* [`AwkwardArray.to_vector`](@ref) converts an Awkward Array into Julia data | ||
* [`AwkwardArray.from_buffers`](@ref) constructs an Awkward Array from a Form (JSON), length, and buffers for zero-copy passing from Python | ||
* [`AwkwardArray.to_buffers`](@ref) deconstructs an Awkward Array into a Form (JSON), length, and buffers for zero-copy passing to Python | ||
|
||
|
||
## Array functions | ||
|
||
```@autodocs | ||
Modules = [AwkwardArray] | ||
Public = true | ||
Order = [:function] | ||
``` | ||
|
||
# Index | ||
|
||
```@index | ||
Pages = ["functions.md"] | ||
``` |
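The conversion functions listed in this file can be illustrated with a short round trip. This is a minimal sketch, not part of the commit: it assumes the `AwkwardArray` package is installed and that `from_iter` accepts a nested `Vector{Vector{Float64}}`. The variable names (`data`, `array`, `roundtrip`) are illustrative only.

```julia
using AwkwardArray

# Nested Julia data with variable-length inner lists (one of them empty).
data = [[1.1, 2.2, 3.3], Float64[], [4.4, 5.5]]

# Convert Julia data into an Awkward Array; the layout is chosen by the library.
array = AwkwardArray.from_iter(data)

# Built-in functions such as `length` and `getindex` apply to the result.
n = length(array)
first_list = array[1]

# Convert the Awkward Array back into plain Julia data.
roundtrip = AwkwardArray.to_vector(array)
```

Qualified names (`AwkwardArray.from_iter`) are used here so the sketch does not depend on which names the package exports.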
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
```@meta | ||
CurrentModule = AwkwardArray | ||
``` | ||
|
||
# Types | ||
|
||
```@index | ||
Pages = ["indexing.md"] | ||
``` | ||
|
||
## Indexing | ||
|
||
FIXME |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
```@meta | ||
CurrentModule = AwkwardArray | ||
``` | ||
|
||
# Types | ||
|
||
```@index | ||
Pages = ["internals.md"] | ||
``` | ||
|
||
## Internals | ||
|
||
FIXME |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
```@meta | ||
CurrentModule = AwkwardArray | ||
``` | ||
|
||
# Types | ||
|
||
## Array layout classes | ||
|
||
In Python, we make a distinction between high-level `ak.Array` (for data analysts) and low-level `Content` memory layouts (for downstream developers). In Julia, it's more advantageous to expose the concrete type details to all users, particularly for defining functions with multiple dispatch. Thus, there is no `ak.Array` equivalent. | ||
|
||
The layout classes (subclasses of `AwkwardArray.Content`) are: | ||
|
||
| Julia class | corresponding Python | corresponding Arrow | description | | ||
|:--|:--|:--|:--| | ||
| [`PrimitiveArray`](@ref) | [NumpyArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.NumpyArray.html) | [primitive](https://arrow.apache.org/docs/format/Columnar.html#fixed-size-primitive-layout) | one-dimensional array of booleans, numbers, date-times, or time-differences | | ||
| [`EmptyArray`](@ref) | [EmptyArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.EmptyArray.html) | _(none)_ | length-zero array with unknown type (usually derived from untyped sources) | | ||
| [`ListOffsetArray`](@ref) | [ListOffsetArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.ListOffsetArray.html) | [list](https://arrow.apache.org/docs/format/Columnar.html#variable-size-list-layout) | variable-length lists defined by an index of `offsets` | | ||
| [`ListArray`](@ref) | [ListArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.ListArray.html) | _(none)_ | variable-length lists defined by more general `starts` and `stops` indexes | | ||
| [`RegularArray`](@ref) | [RegularArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.RegularArray.html) | [fixed-size](https://arrow.apache.org/docs/format/Columnar.html#fixed-size-list-layout) | lists of uniform `size` | | ||
| [`RecordArray`](@ref) | [RecordArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.RecordArray.html) with `fields` | [struct](https://arrow.apache.org/docs/format/Columnar.html#struct-layout) | struct-like records with named fields of different types | | ||
| [`TupleArray`](@ref) | [RecordArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.RecordArray.html) with `fields=None` | _(none)_ | tuples of unnamed fields of different types | | ||
| [`IndexedArray`](@ref) | [IndexedArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.IndexedArray.html) | [dictionary](https://arrow.apache.org/docs/format/Columnar.html#dictionary-encoded-layout) | data that are lazily filtered, duplicated, and/or rearranged by an integer `index` | | ||
| [`IndexedOptionArray`](@ref) | [IndexedOptionArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.IndexedOptionArray.html) | _(none)_ | same but negative values in the `index` correspond to `Missing` values | | ||
| [`ByteMaskedArray`](@ref) | [ByteMaskedArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.ByteMaskedArray.html) | _(none)_ | possibly-missing data, defined by a byte `mask` | | ||
| [`BitMaskedArray`](@ref) (only `lsb_order = true`) | [BitMaskedArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.BitMaskedArray.html) | [bitmaps](https://arrow.apache.org/docs/format/Columnar.html#validity-bitmaps) | same, defined by a `BitVector` | | ||
| [`UnmaskedArray`](@ref) | [UnmaskedArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.UnmaskedArray.html) | same | in-principle missing data, but none are actually missing so no mask | | ||
| [`UnionArray`](@ref) | [UnionArray](https://awkward-array.org/doc/main/reference/generated/ak.contents.UnionArray.html) | [dense union](https://arrow.apache.org/docs/format/Columnar.html#dense-union) | data of different types in the same array | | ||
|
||
Any node in the data-type tree can carry `Dict{String,Any}` metadata as `parameters`, as well as a `behavior::Symbol` that can be used to define specialized behaviors. For instance, arrays of strings (constructed with `StringOffsetArray`, `StringArray`, or `StringRegularArray`) are defined by `behavior = :string` (instead of `behavior = :default`). | ||
|
||
## Types specification | ||
|
||
```@autodocs | ||
Modules = [AwkwardArray] | ||
Public = true | ||
Order = [:type] | ||
``` | ||
|
||
## Examples | ||
|
||
```julia | ||
julia> using AwkwardArray: StringOffsetArray | ||
|
||
julia> array = StringOffsetArray() | ||
0-element ListOffsetArray{Vector{Int64}, PrimitiveArray{UInt8, Vector{UInt8}, :char}, :string} | ||
|
||
julia> append!(array, ["one", "two", "three", "four", "five"]) | ||
5-element ListOffsetArray{Vector{Int64}, PrimitiveArray{UInt8, Vector{UInt8}, :char}, :string}: | ||
"one" | ||
"two" | ||
"three" | ||
"four" | ||
"five" | ||
|
||
julia> array[3] | ||
"three" | ||
|
||
julia> typeof(array[3]) | ||
String | ||
``` | ||
|
||
Most applications of `behavior` apply to `RecordArrays` (e.g. [Vector](https://github.com/scikit-hep/vector) in Python). | ||
|
||
## Index | ||
|
||
```@index | ||
Pages = ["types.md"] | ||
``` |