Release 0.1.5 for @philipc's PR #11 #12

Merged
merged 29 commits into from
Sep 14, 2015
29 commits
7dba710
Return wcwidth of 0 for combining characters
philipc Sep 2, 2015
72ef7fe
Use general category to determine zero width combining characters
philipc Sep 2, 2015
92b28f8
Use python3.5 to update unicode tables
jquast Sep 14, 2015
4deb563
Remove table_comb.py entirely, if it is unused!
jquast Sep 14, 2015
800363d
README.rst brevity and clarity
jquast Sep 14, 2015
45dcdbc
setup.py: remove develop tat, remove table_comb.py
jquast Sep 14, 2015
4baf950
move-to LICENSE file to compliment github
jquast Sep 14, 2015
b1e6d0d
move developer requirements over to file
jquast Sep 14, 2015
f8782ee
add python3.5 support to tox.
jquast Sep 14, 2015
78cdf05
ignore .DS_Store
jquast Sep 14, 2015
5644724
Revise docstrings of wcwidth & wcswidth.
jquast Sep 14, 2015
07cea7f
Brevity and use true sphinx format in docstring
jquast Sep 14, 2015
38d2e81
return to usedevelop=false (implicit)
jquast Sep 14, 2015
9f543e7
remove SetupDevelop target (erk!)
jquast Sep 14, 2015
410864a
use sudo: false for great travis speeds
jquast Sep 14, 2015
3fdd990
remove wcwidth-combining-comparator
jquast Sep 14, 2015
1349efb
add 'docent' to requirements-develop.txt
jquast Sep 14, 2015
5337aa3
remove custom 'setup.py develop' references
jquast Sep 14, 2015
179a10c
remove combining character from tests
jquast Sep 14, 2015
f0ab2b6
Prepare setup.py for 0.1.5 release
jquast Sep 14, 2015
88bea80
do static analysis via travis-ci py3.4
jquast Sep 14, 2015
1ecbbf8
bugfix duplicate wcwidth.c target in .rst
jquast Sep 14, 2015
ebba4d0
rename prospector->sa (static analysis)
jquast Sep 14, 2015
8e28f44
add static analysis to requirements-develop.txt
jquast Sep 14, 2015
c03620f
static analysis: 2 newlines between funcs
jquast Sep 14, 2015
67df6fa
prepare 0.1.5 changelog and docfix rst by linter
jquast Sep 14, 2015
f2eb4b9
travis-ci faux-shell semicolon fix
jquast Sep 14, 2015
d72c944
freeze develop dependencies
jquast Sep 14, 2015
426d748
resolve static analysis and 'usedevelop' tests
jquast Sep 14, 2015
1 change: 1 addition & 0 deletions .gitignore
@@ -12,3 +12,4 @@ docs/_build
htmlcov
.coveralls.yml
data
.DS_Store
4 changes: 4 additions & 0 deletions .travis.yml
@@ -1,4 +1,5 @@
language: python
sudo: false

env:
- TOXENV=py26
@@ -19,6 +20,9 @@ install:

script:
- tox -e $TOXENV
- if [[ $TOXENV == "py34" ]]; then
tox -esa;
fi

after_success:
- if [[ $TOXENV == "py34" ]]; then
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2014 Jeff Quast <contact@jeffquast.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
201 changes: 71 additions & 130 deletions README.rst
@@ -26,58 +26,31 @@
Introduction
============

This API is mainly for Terminal Emulator implementors -- any python program
that attempts to determine the printable width of a string on a Terminal. It
is implemented in python (no C library calls) and has no 3rd-party dependencies.

It is certainly possible to use your Operating System's ``wcwidth(3)`` and
``wcswidth(3)`` calls if it is POSIX-conforming, but this would not be possible
on non-POSIX platforms, such as Windows, or for alternative Python
implementations, such as jython. It is also commonly many releases older
than the most current Unicode Standard release files, which this project
aims to track.

The most current release of this API is based from Unicode Standard release
*7.0.0*, dated *2014-02-28, 23:15:00 GMT [KW, LI]* for table generated by
file ``EastAsianWidth-7.0.0.txt`` and *2014-02-07, 18:42:08 GMT [MD]* for
``DerivedCombiningClass-7.0.0.txt``.
This API is mainly for Terminal Emulator implementors, or those writing
programs that expect to be interpreted by a terminal emulator and wish to
determine the printable width of a string on a Terminal.

Installation
------------
Usually, the length of the string is equivalent to the number of cells
it occupies, except that there are also some categories of characters
which occupy 2 or even 0 cells. POSIX-conforming systems provide
``wcwidth(3)`` and ``wcswidth(3)``, which this module's interface mirrors
precisely.

The stable version of this package is maintained on pypi, install using pip::
This library aims to be forward-looking, portable, and as correct as
possible. The most current release of this API is based on Unicode Standard
release files:

pip install wcwidth
``EastAsianWidth-8.0.0.txt``
*2015-02-10, 21:00:00 GMT [KW, LI]*

Problem
-------

You may have noticed some characters especially Chinese, Japanese, and
Korean (collectively known as the *CJK Unified Ideographs*) consume more
than 1 terminal cell. If you ask for the length of the string, ``u'コンニチハ'``
(Japanese: Hello), it is correctly determined to be a length of **5** using
the ``len()`` built-in.

However, if you were to print this to a Terminal Emulator, such as xterm,
urxvt, Terminal.app, PuTTY, or iTerm2, it would consume **10** *cells* (columns).
This causes problems for many of the text-alignment functions, such as ``rjust()``.
On an 80-wide terminal, the following would wrap along the margin, instead
of displaying it right-aligned as desired::

>>> text = u'コンニチハ'
>>> print(text.rjust(80))
コン
ニチハ
``DerivedGeneralCategory-8.0.0.txt``
*2015-02-13, 13:47:11 GMT [MD]*

Solution
--------
Installation
------------

This API allows one to determine the printable length of these strings,
that the length of ``wcwidth(u'コ')`` is reported as ``2``, and
``wcswidth(u'コンニチハ')`` as ``10``.
The stable version of this package is maintained on pypi, install using pip::

This allows one to determine the printable effects of displaying *CJK*
characters on a terminal emulator.
pip install wcwidth

wcwidth, wcswidth
-----------------
@@ -89,39 +62,45 @@ To Display ``u'コンニチハ'`` right-adjusted on screen of 80 columns::
>>> from wcwidth import wcswidth
>>> text = u'コンニチハ'
>>> print(u' ' * (80 - wcswidth(text)) + text)
コンニチハ
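
The right-adjustment shown above can be wrapped into a small helper. The following is only a sketch: ``cell_width`` and ``rjust_cells`` are hypothetical names, and ``cell_width`` is a standard-library stand-in for ``wcswidth()`` (it counts two cells for East Asian Wide and Full-width characters and one cell for everything else), not this library's implementation.

```python
import unicodedata


def cell_width(text):
    """Rough stand-in for wcswidth(): 2 cells for W/F characters, else 1."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)


def rjust_cells(text, width, fillchar=u" "):
    """Right-justify by terminal cells rather than by len()."""
    padding = max(0, width - cell_width(text))
    return fillchar * padding + text


if __name__ == "__main__":
    print(rjust_cells(u"コンニチハ", 80))
```

With the real library installed, ``cell_width`` could presumably be replaced by ``wcswidth`` directly, provided the ``-1`` return value is handled first.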

Return Values
-------------

``-1``
Indeterminate (not printable).

Values
------
``0``
Does not advance the cursor, such as NULL or Combining.

A general overview of return values:
``2``
Characters of category East Asian Wide (W) or East Asian
Full-width (F) which are displayed using two terminal cells.

- ``-1``: indeterminate (see Todo_).
- ``0``: do not advance the cursor, such as NULL.
- ``2``: East_Asian_Width property values W and F (Wide and Full-width).
- ``1``: all others.
``1``
All others.

``wcswidth()`` simply returns the sum of all values along a string, or
``-1`` if it has occurred for any value returned by ``wcwidth()``. A more
exacting list of conditions and return values may be found in the docstring
for ``wcwidth()``.
``-1`` in total if any part of the string results in -1. A more exact
list of conditions and return values may be found in the docstring::

$ pydoc wcwidth
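
To make the return values above concrete, here is a rough approximation built only on the standard ``unicodedata`` module. It is not the code in this project: the names ``approx_wcwidth`` and ``approx_wcswidth`` are hypothetical, the zero-width set is assumed to be general categories ``Mn`` and ``Me``, and the results depend on the Unicode version bundled with the running interpreter.

```python
import unicodedata


def approx_wcwidth(ch):
    """Approximate cell width of a single character: -1, 0, 1 or 2."""
    code = ord(ch)
    if code == 0:
        return 0            # NULL does not advance the cursor
    if code < 32 or 0x7F <= code < 0xA0:
        return -1           # C0/C1 control characters: indeterminate
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0            # assumed zero-width combining categories
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2            # East Asian Wide or Full-width
    return 1                # all others


def approx_wcswidth(text):
    """Sum of the widths, or -1 as soon as any character yields -1."""
    total = 0
    for ch in text:
        width = approx_wcwidth(ch)
        if width < 0:
            return -1
        total += width
    return total
```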

Discrepacies
------------

There may be discrepancies with the determined printable width of of characters
by *wcwidth* and the results of any given terminal emulator -- most commonly,
emulators are using your Operating System's ``wcwidth(3)`` implementation which
is often based on tables much older than the most current Unicode Specification.
Python's determination of non-zero combining_ characters may also be based on an
older specification.
Discrepancies
-------------

You may determine an exacting list of these discrepancies using files
`wcwidth-libc-comparator.py`_ and `wcwidth-combining-comparator.py`_
This library does its best to return the most appropriate return value for a
very particular terminal user interface where a monospaced fixed-cell
rendering is expected. As the POSIX Terminal programming interfaces do not
provide any means to determine the unicode support level, we can only do our
best to return the *correct* result for the given codepoint, and not what any
particular terminal emulator does.

.. _`wcwidth-libc-comparator.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-libc-comparator.py
.. _`wcwidth-combining-comparator.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-combining-comparator.py
Python's determination of non-zero combining_ characters may also be based on
an older specification.

You may determine an exacting list of these discrepancies using the project
files `wcwidth-libc-comparator.py <https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-libc-comparator.py>`_ and `wcwidth-combining-comparator.py <https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-combining-comparator.py>`_.
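
As a quick check of which Unicode data the running interpreter carries, the standard library exposes its version and its view of combining classes. This is a minimal sketch with hypothetical helper names; it inspects Python's own ``unicodedata`` module and says nothing about the tables shipped with this project.

```python
import unicodedata


def unicode_data_version():
    """Version of the Unicode database bundled with this interpreter."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def is_nonzero_combining(ch):
    """True when Python reports a non-zero canonical combining class."""
    return unicodedata.combining(ch) != 0


if __name__ == "__main__":
    print(unicode_data_version())
    print(is_nonzero_combining(u"\u0301"))  # COMBINING ACUTE ACCENT
```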


==========
@@ -140,22 +119,20 @@ Updating Tables
The command ``python setup.py update`` will fetch the following resources:

- http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
- http://www.unicode.org/Public/UNIDATA/extracted/DerivedCombiningClass.txt
- http://www.unicode.org/Public/UNIDATA/extracted/DerivedGeneralCategory.txt

And generate the table files `wcwidth/table_wide.py`_ and `wcwidth/table_comb.py`_.
And generates the table files:

.. _`wcwidth/table_wide.py`: https://github.com/jquast/wcwidth/tree/master/wcwidth/table_wide.py
.. _`wcwidth/table_comb.py`: https://github.com/jquast/wcwidth/tree/master/wcwidth/table_comb.py
- `wcwidth/table_wide.py <https://github.com/jquast/wcwidth/tree/master/wcwidth/table_wide.py>`_
- `wcwidth/table_zero.py <https://github.com/jquast/wcwidth/tree/master/wcwidth/table_zero.py>`_
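
How such range tables might be produced and queried can be sketched as follows. Everything here is an assumption for illustration: ``parse_ranges`` and ``bisearch`` are hypothetical names (the browser tool in this PR imports a private ``_bisearch`` whose signature is not shown here), and ``SAMPLE`` is a short hand-written excerpt in the style of ``EastAsianWidth.txt``, not the fetched file.

```python
SAMPLE = """\
# illustrative excerpt in the style of EastAsianWidth.txt
0041..005A;Na    # LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
1100..115F;W     # HANGUL CHOSEONG KIYEOK..
3000;F           # IDEOGRAPHIC SPACE
3001..3003;W     # IDEOGRAPHIC COMMA..DITTO MARK
"""


def parse_ranges(text, wanted=("W", "F")):
    """Collect sorted, merged (start, end) ranges for the wanted properties."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()    # drop trailing comments
        if not data:
            continue
        codes, prop = [field.strip() for field in data.split(";")]
        if prop not in wanted:
            continue
        start, _, end = codes.partition("..")   # single code point: no ".."
        ranges.append((int(start, 16), int(end or start, 16)))
    ranges.sort()
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1] + 1:
            # adjacent or overlapping: extend the previous range
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return tuple(merged)


def bisearch(ucs, table):
    """Binary search for code point ``ucs`` in a sorted table of ranges."""
    lo, hi = 0, len(table) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if ucs > table[mid][1]:
            lo = mid + 1
        elif ucs < table[mid][0]:
            hi = mid - 1
        else:
            return True
    return False


WIDE_SAMPLE = parse_ranges(SAMPLE)
```

Storing merged ranges rather than individual code points keeps the table small, and the binary search keeps each lookup logarithmic in the number of ranges.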

wcwidth.c
---------

This code was originally derived directly from C code of the same name,
whose latest version is available at: `wcwidth.c`_ And is authored by
Markus Kuhn -- 2007-05-26 (Unicode 5.0)

.. _`wcwidth.c`: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

whose latest version is available at
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c and is authored by Markus Kuhn,
2007-05-26 (Unicode 5.0).

Examples
--------
@@ -167,85 +144,49 @@ This library is used in:
- `jonathanslenders/python-prompt-toolkit`_, a Library for building powerful
interactive command lines in Python.

Additional tools for displaying and testing wcwidth is found in the ``bin/``
folder of this project (github link: `wcwidth/bin`_). They are not distributed
as a script or part of the module.
Additional tools for displaying and testing wcwidth are found in the `bin/
<https://github.com/jquast/wcwidth/tree/master/bin>`_ folder of this project.
They are not distributed as a script or part of the module.

.. _`jquast/blessed`: https://github.com/jquast/blessed
.. _`jonathanslenders/python-prompt-toolkit`: https://github.com/jonathanslenders/python-prompt-toolkit
.. _`wcwidth/bin`: https://github.com/jquast/wcwidth/tree/master/bin

Todo
----

Though some of the most common ("zero-width") `combining`_ characters
are understood by wcswidth, there are still many edge cases that need
to be covered, especially certain kinds of sequences such as those
containing Control-Sequence-Inducer (CSI).


License
-------

The original license is as follows::

Permission to use, copy, modify, and distribute this software
for any purpose and without fee is hereby granted. The author
disclaims all warranties with regard to this software.

No specific licensing is specified, and Mr. Kuhn resides in the UK which allows
some protection from Copyrighting. As this derivative is based on US Soil,
an OSI-approved license that appears most-alike has been chosen, the MIT license::

The MIT License (MIT)

Copyright (c) 2014 <contact@jeffquast.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

Changes
-------

0.1.4
0.1.5 *2015-09-13 Alpha*
* **Bugfix**:
Resolution of "combining character width": most notably,
characters that previously returned -1 now often (correctly) return 0.
Resolved by `Philip Craig`_ via `PR #11`_.

0.1.4 *2014-11-20 Pre-Alpha*
* **Feature**: ``wcswidth()`` now determines printable length
for (most) combining characters. The developer's tool
`bin/wcwidth-browser.py`_ is improved to display combining_
characters when provided the ``--combining`` option
(`Thomas Ballinger`_ and `Leta Montopoli`_ `PR #5`_).
* added static analysis (prospector_) to testing framework.

0.1.3
0.1.3 *2014-10-29 Pre-Alpha*
* **Bugfix**: 2nd parameter of wcswidth was not honored.
(`Thomas Ballinger`_, `PR #4`).
(`Thomas Ballinger`_, `PR #4`_).

0.1.2
0.1.2 *2014-10-28 Pre-Alpha*
* **Updated** tables to Unicode Specification 7.0.0.
(`Thomas Ballinger`_, `PR #3`).
(`Thomas Ballinger`_, `PR #3`_).

0.1.1
0.1.1 *2014-05-14 Pre-Alpha*
* Initial release to pypi, Based on Unicode Specification 6.3.0

.. _`prospector`: https://github.com/landscapeio/prospector
.. _`combining`: https://en.wikipedia.org/wiki/Combining_character
.. _`bin/wcwidth-browser.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-browser.py
.. _`Thomas Ballinger`: https://github.com/thomasballinger
.. _`Leta Montopoli`: https://github.com/lmontopo
.. _`Philip Craig`: https://github.com/philipc
.. _`PR #3`: https://github.com/jquast/wcwidth/pull/3
.. _`PR #4`: https://github.com/jquast/wcwidth/pull/4
.. _`PR #5`: https://github.com/jquast/wcwidth/pull/5
.. _`PR #11`: https://github.com/jquast/wcwidth/pull/11
10 changes: 5 additions & 5 deletions bin/wcwidth-browser.py
@@ -37,7 +37,7 @@
 import signal

 # local
-from wcwidth import wcwidth, table_comb
+from wcwidth.wcwidth import _bisearch, wcwidth, COMBINING

 # 3rd-party
 from blessed import Terminal
@@ -116,6 +116,7 @@ def __init__(self, width=2):
         self.characters = (unichr(idx)
                            for idx in xrange(LIMIT_UCS)
                            if wcwidth(unichr(idx)) == width
+                           and not _bisearch(idx, COMBINING)
                            )

     def __iter__(self):
@@ -152,13 +153,13 @@ def __init__(self, width=1):
         """
         self.characters = []
         letters_o = (u'o' * width)
-        for boundaries in table_comb.NONZERO_COMBINING:
+        for boundaries in COMBINING:
             for val in [_val for _val in
                         range(boundaries[0], boundaries[1] + 1)
                         if _val <= LIMIT_UCS]:
                 self.characters.append(letters_o[:1] +
                                        unichr(val) +
-                                       letters_o[1:])
+                                       letters_o[wcwidth(unichr(val))+1:])
         self.characters.reverse()

     def __iter__(self):
@@ -647,8 +648,7 @@ def text_entry(self, ucs, name):
         delimiter = style.attr_minor(style.delimiter)
         if len(ucs) != 1:
             # determine display of combining characters
-            val = ord(next((_ucs for _ucs in ucs
-                            if wcwidth(_ucs) == -1)))
+            val = ord(ucs[1])
             # a combining character displayed of any fg color
             # will reset the foreground character of the cell
             # combined with (iTerm2, OSX).