-
Notifications
You must be signed in to change notification settings - Fork 16
/
Copy pathTodo.xml
567 lines (492 loc) · 14.4 KB
/
Todo.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="XSL/Todo.xsl" ?>
<topics xmlns:r="http://www.r-project.org">
<title>CodeDepends</title>
<topic>
<title>CodeDepends</title>
<items>
<title>Issues</title>
<item>
sourceVariable() needs to find all of the functions
that are called in any function that is defined in the dependencies.
</item>
<item status="verify">
Fix getInputs so that it doesn't add the elements on the RHS to updates.
See ~/Projects/CaseStudies/BML/BML.Rdb. and sc[[45]]@outputs and specifically
adding grid to the updates..
</item>
<item status="verify">
Fix getInputs so that if it is just an update, then don't put in the outputs
See ~/Projects/CaseStudies/BML/BML.Rdb. and sc[[45]]@outputs.
</item>
<item>
getDepends
getVariables() also includes @updates, so found
then includes updates, but not
<r:code>
sc = readScript("~/Projects/CaseStudies/BML/BML.Rdb")
inputs = getInputs(sc)
getDepends("createGrid", sc)
</r:code>
</item>
<item>
Handle calls to substitute.
Also quote.
</item>
<item>
Don't seem to identify x as an output in
<r:code>
ans = numeric(length(x))
for(i in seq(along =x)[-1])
x[i] = x[i-1]*2
ans
</r:code>
Should be for the second expression - the for loop where body updates x.
</item>
<item>
Get getInputs() from the body of a function, an expression, etc.
to behave the same as reading from a script.
See RLLVMCompile/explorations/deadVariables.R
</item>
<item>
Is getInputs () method for function correct in having the names of the formals
be outputs and not inputs?
</item>
<item>
findWhenUnneeded probably need to query the updates also.
If an update is done as he last use of a variable, this is currently missed.
(See RLLVMCompile explorations/script.R)
</item>
<item>
findWhenUnneeded returns NA for a variable that is never used, just defined.
Od we want 0 or 1 or where it is defined?
</item>
<item>
<r:expr>if (as.character(fn) == "close")</r:expr>.
fn can be of the form x$foo and so have length greater than 1.
</item>
<item> Back again:<br/>
sourceVariable doesn't seem to run the function definition expressions.
<br/>
I've added facilities to do this, identifying functions as
either locally defined within the script or not, or unknown.
<br/>
In CaseStudies/PairsTrading/
<r:code>
sc = readScript("duncanSol.Rdb")
sourceVariable("r", , sc)
</r:code>
</item>
<item status="check">
sourceVariable seems to run up to but not including the expression to create c
<br/>
Actually it seems we just need force = TRUE.
<r:code>
sc = readScript("duncanSol.Rdb")
sourceVariable("c", , sc)
</r:code>
</item>
<item status="check">
getVariables(sc[[3]]) doesn't seem to work. No method for an expression.
<br/>
But if we use a ScriptNodeInfo rather than an expression, this is fine.
<br/>
Method added for expression.
</item>
<item status="done">
Names on a script have a preceeding space.
Is this makeTaskNames with names(x)[1] being empty? Yes.
<br/>
Now we trim the names.
</item>
<item>
Add in locally defined functions in plot(getDetailedTimelines(.)).
<br/>
Can use functions = TRUE. However, functions = NULL doesn't turn all
of the functions off.
<br/>
[Fixed] When we include functions, c shows up define many places.
<br/>
getDetailedTimeline now uses the functions argument in the call to getVariables.
</item>
<item status="done">
Allow suppressing the display of the expressions in readScript.
<br/>
Added a ... and this is passed on to readXMLScript and any other fragment readers.
Use verbose = FALSE for XML documents.
</item>
<item>
Show functions being used in time lines,
i.e. color them differently.
<br/>
They seem to now appear as being redefined!
</item>
<item>
Make an interactive, SVG plot of the timelines.
Show the code when we mouseover or click on a point.
</item>
<item>
Connect the SVG display to an HTML document with the code parts.
Allow the viewer to navigate the document via the plot.
</item>
<item status="ok">
duncalSol.Rdb in Projects/CaseStudies/PairsTrading/
shows a red before a green in getProfit.K (?)
<br/>
This is fine as we do reference the function before we
define it.
<r:code>
sc = readScript("duncanSol.Rdb")
plot(getDetailedTimelines(sc))
</r:code>
</item>
<item status="verify">
Seems to be working:
<br/>
Recognize TRUE/FALSE, literals, @slotname, $name, not as regular variables.
<br/>
I think we probably don't want to include these literals as variables.
They are literals/constants. And we don't seem to include them in ScriptNodeInfo.
See
<r:code>
sc = readScript("inst/samples/literal.R")
as(sc, "ScriptInfo")
</r:code>
</item>
<item>
Variables in loops - when they are local to the loop, recognize this.
See inst/samples/loop.R. i is not defined until the loop.
We seem to need a new concept to introduced here.
</item>
<item status="done">
highlightCode( inline = TRUE)
doesn't include the CSS and the JS doesn't work.
<br/>
Was using <xml:tag>link</xml:tag> for inline version when I
should have been using <xml:tag>style</xml:tag>.
</item>
<item>
Change the CSS for highlight.css to get better colors and fonts.
</item>
<item>
Display code in HTML and have a plot
of the task graph, or the detailed timeline
and link the two.
Allow highlighting a variable in the plot and
then show the variables in the code.
Two modes - where variables are assigned and when they are used.
</item>
<item>
How are getExpressionThread and getVariableDepends different?
</item>
<item>
findWhenUnneeded misses variables used in a formula. See example in Rd file for x.
Use cleanVars1.R.
Also, we remove the data frame which the fit object may reference.
</item>
<item>
Update functions and methods to allow a DynScript where a regular Script can be.
Just coerce. and/or method for getInputs()
</item>
<item>
Add support for include formula variables in analysis
of getPropagateChanges(), etc.
</item>
<item>
Determine when a code chunk redefines a variable.
This changes the dependencies
<r:code>
p = foo()
p = 1
bar(p)
</r:code>
<r:code>
getInputs(quote({p = foo();p = 1; bar(p)}))
</r:code>
gives p as an output and no input.
We could say updates p.
<br/>
What about
<r:code>
getInputs(quote({p = p + 1; bar(p)}))
</r:code>
This has p as inputs and outputs. This is correct.
</item>
<item>
Would like to be able to update the script, see what expressions have changed
and then use that information when using sourceVariable().
See if Gabe has done something on this.
</item>
<item status="done">
If a variable is named missing, we don't count it in the dependencies.
<br/>
Yes we do. See inst/samples/missing.R
</item>
<item status="check">
Identify as a dependency a variable that is used
as the default value of a function that is called in our
expressions.
See fgets.Rdb in Rllvm/explorations.
<br/>
<r:code>
sc = readScript(system.file("samples", "defaultValue.R", package = "CodeDepends"))
k = as(sc, "ScriptInfo")
k[[3]]@inputs
k[[4]]@inputs
</r:code>
</item>
<item>
(Recursively) find global variables in functions.
See fgets.Rdb in Rllvm/explorations.
</item>
<item>
Come up with a mechanism that identifies
side effects via closures.
See for example the RClangSimple/RCIndex examples (notDone.R)
where we call parseTU(file, col$fun) and col$fun is a closure.
<br/>
This is slightly tricky as we want to mark col as being updated,
not col$fun. We could identify col as a reference type or mutable
and hence if it is passed to a function, we can consider it mutated
or possibly so. We can also analyze the function to which it is passed
and see if it uses the mutable parts, and do this recursively to the functions
it calls and so on.
</item>
<item>
Function to determine if a function (potentially) modifies its arguments.
Put another way, determine which parameters are const.
This can be part of a type inference package I am working on
(github.com:duncantl/RTypeInference).
</item>
<item>
Function that just revaluates the command to define a variable,
e.g. <r:expr>sc$reval('foo')</r:expr>
So this is for when we don't want to find the command and
have the script.
</item>
<item status="done">
getSectionDepends does'nt seem to recognize need for hout in RClangSimple/inst/TU/routinese.R
when source'ing <r:var>done</r:var>.
</item>
<item>
Add more checks for sideEffects. See R/sideEffects.R
Just <r:expr>cat(, file = sym)</r:expr> here.
These pretty much need to be done manually.
</item>
<item status="done">
In sourceVariable and getVariableDepends, recognize
side effects such as <r:expr>cat(, file = var)</r:expr>.
<br/>
For an example, see routines.R and hout in RClangSimple/inst/TU (Apr 23 '13)
<br/>
Should we capture these as sideEffects or in the updates slot?
</item>
<item>
Make the Script object something we can do something like
<r:expr> sc[["foo"]] </r:expr>
to run the code to get foo and the return foo.
This is just syntax for sourceVariable.
<br/>
Doesn't sc$name do this already.
<br/>
FIX: But it doesn't handle function definitions.
</item>
<item>
For tasks/nodes that have a single output, put the name of the variable
as the name of the task.
</item>
<item>
Allow sourceVariable() to check what variables
already exist and just run the code for those that don't.
This may not always be what we want, but it can be useful
when we have different tasks some of which share the same inputs.
When we run one task and then another, we don't want to necessarily run
the shared code again.
</item>
<item>
Make a reference Script object that checks to see if the file has been updated
and changes its information when it has.
</item>
<item status="extend">
sourceVariable() should load packages.
Ideally determine which functions/variables/symbols come from
which packages.
If the functions don't exist when we go to evaluate the call,
then load the packages in earlier.
</item>
<item status="done">
Make sourceVariable recognize if it is given a script as the second argument.
</item>
<item status="done">
sourceVariable is repeating the same commands. Need
to ensure getVariableDepends() returns unique expressions.
<br/>
<r:func>sapply</r:func> rather than <r:func>lapply</r:func>
led to sort on columns.
</item>
<item status="high">
<r:code><![CDATA[
getInputs(quote(x[i] <- x[i] + 2))
]]></r:code>
says that i is an output! It is on the
LHS, but it is not an output. Need to be more precise.
</item>
<item>
Handle dget, .readRDS, etc. as potentially reading files.
</item>
<item status="verify">
When an assignment is not a simple name, take the name
of the first variable mentioned and use that as an input.
<br/>
Test to see if it is working. Changed the is.name(e[[2]][[2]]) condition.
Need to be more general.
<br/>
<r:code><![CDATA[
getInputs(quote( x[[3]]$foo <- k + g(z)))
]]></r:code>
</item>
<item status="done">
Computing inputs is not getting covs in
sitepairs.R, specifically the 28th expression.
<r:code>
s = readScript("inst/samples/sitepairs.R")
getInputs(s[[28]])
</r:code>
<br/>
For some silly reason, I was excluding variables that were also in the outputs.
</item>
<item status="verify">
Seems to be working - April 1, 2014
<br/>
Names of assignment functions not being added to functions in
getInputs. E.g. in
<r:code><![CDATA[
getInputs(quote(colnames(covs) <- paste(xcolnames(covs), "c", sep = ".")))
]]></r:code>
should be catching the colnames.
</item>
<item>
Get the graphs working properly.
<r:code>
sc = readScript("inst/samples/parallel.R")
g = makeVariableGraph(,sc)
library(Rgraphviz)
plot(g)
</r:code>
</item>
<item status="check">
Don't include in the variables the right hand side of a $ call,
i.e. x$foo should only reference x.
See getInputs.language in codeDepends.
See
<r:code>
sc = readScript("inst/samples/dollar.R")
i = getInputs(sc)
</r:code>
or more simply
<r:code><![CDATA[
getInputs(expression(x$a <- 1))
]]></r:code>
The assignment includes the rhs, i.e. i[[3]]
<br/>
We may want to introduce a new class of "inputs" and "outputs"
named, e.g., "modified" to reflect inline modifications
via x$foo and x[i] and x[i, j]
</item>
<item status="verify">
Document the undocumented functions.
<r:code>
library(tools)
undoc("CodeDepends", ".")
</r:code>
</item>
<item status="check">
Handle complex assignments (i.e. left hand sides that are not simple names)
where we add them to the outputs.
</item>
<item>
getExpressionThread() and redefined variables and infinite loops.
</item>
</items>
<items>
<item>
Follow load() and source() commands to identify variables they provide.
Recursively process source() commands.
Get the table of contents for a load() file.
</item>
<item>
Introduce a SourceNodeInfo that extends ScriptNodeInfo and
indicates this is a source command.
Then introduce other important concepts that are not generic.
</item>
<item>
Determine parallel/non-dependent blocks.
</item>
<item status="done">
Determine the point at which a variable is no longer needed
and be able to add code to remove it, e.g. a cleanup section
for each code block.
<br/>
This is as simple as looking at the info for later code blocks
and see if a variable is in the inputs of any of these.
<br/>
See addRemoveIntermediates.
</item>
<item status="done">
Create a graph connecting the expressions with edges going
from an expression block that is an input to another task.
<br/>
makeGraph
</item>
<item>
Add support for going to individual expressions and not just blocks.
i.e. so that we can extract subsets of a block when updating.
<br/>
The existing code will do all of the hard work. The issue
is to arrange the results within a code block/expression
based on the individual calls.
This is not hard, just a change to our setup.
So we will want to introduce new classes to preserve the
existing material while introducing new approaches.
</item>
<item status="done">
getInputs() on an R function (e.g. arima0) seems
to have a lot of redefinitions.
<r:code>
info = getInputs(arima0)
dtm = getDetailedTimelines(, ff)
plot(dtm)
</r:code>
<br/>
Problem was I was reusing the same collector() for
all expressions in case the caller gave us a collector.
As a result, we were accumulating results across expressions.
</item>
<item status="low">
Plot of the timeline of variables.
<br/>
See getDetailedTimelines() and its plot method.
<br/>
We should make this more polished (fix up lines that go to the end),
and think about what else we should really be plotting.
</item>
<item status="done">
Find rm() commands.
</item>
</items>
</topic>
<topic>
<title>Alternatives and Threads</title>
<items>
<item>
Deal with alternatives in the document.
</item>
<item>
Draw the tasks as a flow.
<br/>
Think about alternatives at this point, but of course they don't really belong here
necessarily.
</item>
</items>
</topic>
</topics>