Releases: sassoftware/saspy
V3.6.4
This release has three new features. Two are new methods to complement dirlist() and file_info() for server-side file system access: file_copy() and file_delete(). Those are pretty self-explanatory, and are in the API doc.
The other new feature is from Issue #353, adding the ability to use STDIO over SSH (only over SSH) from a Windows client. STDIO (SSH or not) can only access a standalone Linux SAS install. But this feature allows the Python process to be on a Windows client as opposed to a Linux client, which has been the requirement since I first wrote it.
V3.6.2
This release just has a few fixes/enhancements. One fix is to the HTTP access method, disabling the encoding option. Unlike the other access methods, where saspy has to transcode everything between Python's utf-8 and the SAS session encoding, the HTTP API being used requires everything to be utf-8, and it transcodes to/from the SAS session encoding itself. Easy fix for saspy, and it's in this release.
There's a new user-contributed method, validvarname, which renames the columns of your dataframe to match the SAS 'validvarname' constraints for however that option is set. This would be used prior to passing that dataframe to df2sd(). So now if you have that option set to a more restrictive value, you can use this method to convert your column names to be compliant. This is something Access Engines do, based upon the option setting, when accessing database tables, which allow names that don't comply.
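To make the renaming idea concrete, here is a minimal sketch of what conforming column names to the VALIDVARNAME=V7 rules (letters, digits and underscores only, no leading digit, at most 32 characters) could look like. This is not saspy's implementation; the function name `to_v7_names` and the collision-suffix scheme are assumptions made up for illustration.

```python
import re

def to_v7_names(columns):
    """Map each column name to a V7-style name (illustrative sketch only)."""
    seen = set()
    renamed = {}
    for col in columns:
        # Replace every character that is not a letter, digit, or underscore.
        name = re.sub(r"[^A-Za-z0-9_]", "_", str(col))
        # A V7 name must start with a letter or an underscore.
        if not re.match(r"[A-Za-z_]", name):
            name = "_" + name
        # V7 names are limited to 32 characters.
        name = name[:32]
        # SAS names are case-insensitive, so resolve clashes with a numeric
        # suffix (hypothetical scheme, chosen only for this sketch).
        base, n = name, 0
        while name.upper() in seen:
            n += 1
            suffix = str(n)
            name = base[: 32 - len(suffix)] + suffix
        seen.add(name.upper())
        renamed[col] = name
    return renamed

mapping = to_v7_names(["Total Sales ($)", "2020 revenue", "id", "ID"])
for old, new in mapping.items():
    print(f"{old!r} -> {new!r}")
```

With a pandas dataframe, a mapping like this could be applied with `df.rename(columns=mapping)` before the transfer; the real method takes care of that for you.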
There are a couple of other fixes. One is in COM, fixing a case where a numeric SAS column containing only missing values became a char type in the dataframe, instead of numeric, when importing with sd2df(). Another allows overrides for index_col= and engine=, also in sd2df. These had been hard coded in sd2df, but can now be user specified.
V3.6.1
This release just has 3 fixes in it. One fixes a regression for STDIO when a dataframe is empty; the empty data set is now still created. That was a break in 3.6.0. Another fixes an import error in the IOM access method that can happen depending, perhaps, on what other modules you have installed. Either way, it's fixed for all cases now. And there was a code path in tail() where an error could happen; that's fixed in this release too. That's all.
V3.6.0
This release has a number of enhancements for df2sd (dataframe2sasdata). This started as performance changes for #326 and included more for #332. The main change had to do with calculating the lengths for char columns of the dataframe, which has to be done to declare the correct byte lengths for the corresponding SAS variables in the SAS data set being created. With a DF having 150 million rows and 100 char columns, this step was taking way too long.
I separated this step out of df2sd into df_char_lengths() so it can be called independently (by the user or by each access method's df2sd), returning a dict with the char column names and lengths. I also enhanced this routine so it can shortcut some of the length calculation and run quicker. df2sd can take these options for when it calls this internally, but it can also take a dict with the char column names and lengths (the one returned by that method, or one you just code yourself), so that the metadata calculation step can be done once, or skipped altogether, going straight to the data transfer.
I also significantly enhanced the data transfer step in the STDIO access method. Transcoding failures are now handled in the data transfer step (though they can still be caught in the length calc routine if wanted), and you now have the option of replacing chars that can't be transcoded with the replacement char, instead of failing.
So there's a lot of new functionality and performance improvement that can be tapped into in this version for df2sd. The default behaviors, for the most part, are still the same as they were. So if df2sd seems too slow, there are a number of ways to improve its performance in this version by tweaking these options.
Oh, and I almost forgot, df2sd also now has an outdsopts={...} parameter which allows you to specify key=value output data set options for the data set being created: for instance, compress=, encoding=, index=, outrep=, replace=, rename= ...
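As a rough illustration of the char-length step described for this release, the sketch below computes the maximum encoded byte length per string column, and also shows one possible shortcut (character count times a worst-case bytes-per-character factor). It works on a plain dict of lists rather than a dataframe so it runs without pandas. The function names and the shortcut are assumptions for illustration, not the code inside df_char_lengths().

```python
def char_byte_lengths(columns, encoding="utf-8", errors="strict"):
    """Return {column name: max encoded byte length} for string columns."""
    lengths = {}
    for name, values in columns.items():
        strings = [v for v in values if isinstance(v, str)]
        if not strings:
            continue  # no string values, so not treated as a char column
        # Encoding every value is exact but is the expensive part on big data.
        lengths[name] = max(len(s.encode(encoding, errors)) for s in strings)
    return lengths

def upper_bound_lengths(columns, bytes_per_char=4):
    """Cheaper estimate: longest string in characters times a worst-case width."""
    lengths = {}
    for name, values in columns.items():
        strings = [v for v in values if isinstance(v, str)]
        if strings:
            lengths[name] = max(len(s) for s in strings) * bytes_per_char
    return lengths

data = {"city": ["Zürich", "Oslo", None], "qty": [1, 2, 3]}

exact = char_byte_lengths(data)                              # 'ü' takes 2 bytes in utf-8
replaced = char_byte_lengths(data, "ascii", errors="replace")  # 'ü' measured as '?'
estimate = upper_bound_lengths(data)

print(exact, replaced, estimate)
```

The exact version fails with a UnicodeEncodeError when a value can't be encoded and errors="strict", which mirrors the idea of catching transcoding problems during the length calculation; the estimate never encodes anything, trading some wasted column width for speed.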
V3.5.4
This release only has a few enhancements in it. The symexist and symget methods have a fix for a macro named 'id', as there was a parsing issue with that one specific name. symput has a new option so you can specify the specific SAS quoting function to use, as the default doesn't always work. SASPy tries to remove as many SAS'isms as possible from the Python code by defaulting behavior for the more usual cases. But SAS has no end of options and varying statements to tweak and special case things, so sometimes you just have to be able to specify specific SAS syntax for certain cases.
The other fix in this release was due to the SPDE engine not supporting (case in point from above!) the common OBS and FIRSTOBS data set options. It has its own names for these and will sometimes allow OBS= and sometimes not. So, I've added support for this in the methods where I generate any of these options (head/tail) so that they work right with this engine as well as others. Also, as part of this, I found that head and tail weren't exactly honoring these options when specified in the DSOPTS of the SASdata object. Now the head and tail set is accurately based off of the result set defined by these options when in DSOPTS.
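The arithmetic behind basing head and tail on an existing FIRSTOBS=/OBS= window can be sketched as follows, using the standard SAS meaning of those options (FIRSTOBS= is the first observation number to read, OBS= is the last). The function names are hypothetical, and this ignores the SPDE-specific option names mentioned above.

```python
def head_window(n, firstobs=1, obs=None):
    """(firstobs, obs) selecting the first n rows of the windowed result set."""
    last = firstobs + n - 1
    if obs is not None:
        last = min(last, obs)  # never read past the user's OBS=
    return firstobs, last

def tail_window(n, nobs, firstobs=1, obs=None):
    """(firstobs, obs) selecting the last n rows of the windowed result set."""
    last = nobs if obs is None else min(obs, nobs)
    first = max(firstobs, last - n + 1)  # never start before the user's FIRSTOBS=
    return first, last

print(head_window(5, firstobs=10, obs=100))
print(tail_window(5, nobs=1000, firstobs=10, obs=100))
```

For a table of 1000 rows with firstobs=10 and obs=100 already in DSOPTS, the head of 5 covers observations 10 through 14 and the tail of 5 covers 96 through 100, rather than the first or last rows of the whole table.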
V3.5.3
V3.5.2
This is another minor release. The one fix in here is to the lastlog() method. The original implementation simply set lastlog to the log from the last submit() call. That was for any method, and for methods which only submitted one block of code, that was fine. But there are many methods which need to run multiple blocks of code, and for those it only returned the part of the log for the last submission. Now lastlog returns the whole log for everything run for each method. This allows a quick and easy way to see just the log from the method that was run. The whole log (saslog()) gets too big to use to just look at the results from one method. Now looking at, and programmatically assessing, the log (checking for errors or unexpected messages) is easy to do since the whole log of the code run is available.
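Programmatically assessing a log could look something like the sketch below, which scans log text for lines beginning with ERROR or WARNING. The sample log is made up for illustration; in practice the text would be whatever lastlog() returns for the method just run. The helper name `find_log_problems` is hypothetical.

```python
import re

# Matches "ERROR:", "WARNING:", and numbered forms such as "ERROR 22-322:".
PROBLEM = re.compile(r"^(ERROR|WARNING)( \d+-\d+)?:")

def find_log_problems(log_text):
    """Return (line number, line) pairs for ERROR/WARNING lines in a SAS log."""
    return [
        (lineno, line)
        for lineno, line in enumerate(log_text.splitlines(), start=1)
        if PROBLEM.match(line)
    ]

# Made-up sample log text, standing in for the output of lastlog().
sample_log = "\n".join([
    "1    data a; set b; run;",
    "ERROR: File WORK.B.DATA does not exist.",
    "NOTE: The SAS System stopped processing this step because of errors.",
    "WARNING: The data set WORK.A may be incomplete.",
])

problems = find_log_problems(sample_log)
for lineno, line in problems:
    print(lineno, line)
```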
V3.5.1
This is a small release. It has a couple of fixes to the HTTP access method which were found in in-house testing. It also has a new feature from a user request, #317, which allows you to easily create SAS data sets in encodings other than the SAS session encoding. 'outencoding=' is an option on df2sd now, and it takes a valid SAS encoding value. There is also a new 'encoding' key in the dsopts dictionary to track this different encoding for the SASdata object. When set, the value will be specified in the encoding= data set option in the generated code for this data set.
V3.5.0
This release has a few enhancements to the HTTP access method, including the url= key, which can be used instead of the ip= and port= and ssl= keys, as all of that will be determined from the url when in the format 'http[s]://host.identifier[:port]'.
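As a sketch of how the three separate settings can be derived from a single URL of that form, the snippet below uses the standard library's urlsplit. The fallback to ports 443 and 80 when no port is given is an assumption based on the usual HTTP defaults, not something confirmed here about saspy's behavior, and `split_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Derive ip/port/ssl style settings from 'http[s]://host[:port]'."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme in {url!r}")
    ssl = parts.scheme == "https"
    # Assumed fallback: the standard default port for the scheme.
    port = parts.port or (443 if ssl else 80)
    return {"ip": parts.hostname, "port": port, "ssl": ssl}

print(split_url("https://viya.example.com:8443"))
print(split_url("http://viya.example.com"))
```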
But the most significant changes in this release are to df2sd and the various sd2df methods. There was a bug found regarding parsing the data values in each direction, where a leading double quote (with no trailing one) in a character column could cause both SAS and Pandas to use 'csv' parsing rules that honor the quotes and ignore the delimiters being used to delimit the raw data values. This caused failures to move data correctly in each direction, with column data being parsed incorrectly. These methods have always used a delimiting strategy to transfer data in both directions, so as to not send blank-padded, full-length data (SAS only has fixed-length padded character columns), and to not need to parse and manipulate the data, adding extra quoting and escaping to data values. The fixes and enhancements in this version address the edge cases found here and make transfers in both directions more robust. More details can be found in PR #314.
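The quoting problem is easy to reproduce with Python's own csv module, which follows similar rules: a field that begins with a double quote switches the parser into quoted mode, and the delimiters that follow are swallowed. The sketch below uses a tab as a stand-in delimiter (an assumption for illustration; it says nothing about the delimiters saspy actually uses) and shows that turning quote handling off keeps the columns intact.

```python
import csv

DELIM = "\t"  # stand-in delimiter, chosen only for this sketch
fields = ["1", '"unbalanced', "tail"]
record = DELIM.join(fields)

# Default csv rules honor the leading quote, so the delimiter after it is
# treated as data and the columns are no longer split correctly.
naive = next(csv.reader([record], delimiter=DELIM))

# With quote handling disabled, only the delimiter separates the values.
raw = next(csv.reader([record], delimiter=DELIM, quoting=csv.QUOTE_NONE))

print("csv rules:  ", naive)
print("quotes off: ", raw)
```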
V3.3.7
This release has mostly enhancements to the HTTP access method, which were driven by using it for internal testing of SAS Viya. Support for more diagnostic messages, exception handling, and better messages is in here. There is also the ability to disable prompting from saspy for cases where a script is running in the background and can't receive input; set prompt=False to get a failure instead of a hang if a prompt would be necessary. Also, a tweak to the symget outtype= parameter so it will accept 'int' or 'float' (strings) to mean int (1) or float (1.0) types while continuing to take objects of those types.
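The outtype= behavior described above can be pictured with a small sketch: the strings 'int' and 'float' name a type directly, while any other object contributes its own type. This is only an illustration of the idea with hypothetical helper names, not the code inside symget.

```python
def resolve_outtype(outtype):
    """Turn 'int'/'float' strings, or an example object, into a Python type."""
    if outtype is None:
        return str  # assumed default for this sketch: leave the value as text
    if outtype in ("int", "float"):
        return {"int": int, "float": float}[outtype]
    return type(outtype)  # e.g. 1 -> int, 1.0 -> float

def convert_macro_value(text, outtype=None):
    """Convert a macro variable's text value to the requested type."""
    return resolve_outtype(outtype)(text)

print(convert_macro_value("42", "int"), convert_macro_value("42", 1.0))
```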