Random freezes with @parallel decorator #14154

Open
miguelmarco opened this issue Feb 20, 2013 · 8 comments
Comments

@miguelmarco
Contributor

I get random freezes of a parallel process with the following code:

def trenza(f,points,exact=True,step=0.1,precision=53):
    if len(points)>2:
        return trenza(f,points[:2],exact,step,precision),trenza(f,points[1:],exact,step,precision)
    F=ComplexField(precision)
    x0=F(points[0])
    x1=F(points[1])
    d=abs(F(x0)-F(x1))
    (x,y)=f.parent().gens()
    y0s=f(x0,QQbar[y].gen()).roots(multiplicities=False)
    dfx=f.derivative(x)
    dfy=f.derivative(y)
    RX=PolynomialRing(F,'x')
    RY=PolynomialRing(F,'y')
    R=PolynomialRing(F,'x,y')
    Rext=PolynomialRing(F,'X0,Y0,x,y,D')
    diffs=filter(lambda a:a!=0,[f.derivative(y,k) for k in range(f.degree()+1)])
    Ak=[Rext(g(Rext('x'),Rext('Y0')+Rext('D')*(x-Rext('X0')))) for g in diffs]
    args=[(f,x0,x1,y0,d,Ak,R,F,RX,x.change_ring(F),y.change_ring(F),RY,dfx,dfy,exact,step) for y0 in y0s]
    l=list(siguehilo(args))
    

@parallel
def siguehilo(f,x0,x1,y0a,d,Ak,R,F,RX,x,y,RY,dfx,dfy,exact,stepx):
    t=F(0)
    y0=F(y0a)
    xi=x0
    puntos=[]
    sigue=True
    uno=False
    pr=F(2)^-(F.precision()-2)
    while t<F(1) or sigue:
        g=f(xi,y).polynomial(y)
        y2=RY(g).newton_raphson(8,y0)
        #while abs(y1-y2)>pr*16:
        #    [y1,y2]=RY(g).newton_raphson(2,y2)
        y0=y2[-1]
        puntos.append([t,y0])
        d0=F(-dfx(xi,y0)/dfy(xi,y0))
        h=1
        if exact:
            pr=2^-(F.precision()-1)
            FR=RealIntervalField(F.precision())
            FC=ComplexIntervalField(F.precision())
            R=PolynomialRing(FC,'x,y')
            RX=PolynomialRing(FC,'x')
            xx0=FC(FR(xi.real()-pr,xi.real()+pr)+FC(I)*FR(xi.imag()-pr,xi.imag()+pr))
            yy0=FC(FR(y0.real()-pr,y0.real()+pr)+FC(I)*FR(y0.imag()-pr,y0.imag()+pr))
            dd=FC(d0)
            Aka=[j.change_ring(FC) for j in Ak]
            akt=[(j(xx0,yy0,x.change_ring(FC)+xx0,0,dd)) for j in Aka]
            akt=[sum([a[0].abs()*RX(a[1]) for a in R(hh)]) for hh in akt]
            a1t=-akt[1]+2*akt[1].coeffs()[0]
            akt[1]=a1t
            L=filter(lambda a: a!=0,akt)
            chequea=False
            h=1
            while not chequea:
                chequea=True
                k=2
                while chequea and k<len(L):
                    L1=(L[k](h)*L[0](h)^(k-1))
                    L2=(QQ(0.157670780786)^k*factorial(k)*L[1](h))
                    if not L2>=L1:
                        chequea=False
                    k+=1
                h=h/2
        else:
            h=F(stepx)
        t+=h/d
        if uno:
            sigue=False
        if t>F(1):
            t=F(1)
            uno=True
        xj=x0*(1-t)+x1*t
        y0+=d0*(xj-xi)
        xi=xj
    return puntos

This is code I am writing to compute the braid monodromy of curves.

If I run it with, for example, this input:

R.<x,y>=QQ[]
f=-y^3+x^2
time trenza(f,[1,I,-1,-I,1],exact=False,step=0.5)

I usually get the answer in a few seconds, but if I repeat the same computation several times, at some point it freezes, as if it were stuck in a very long computation.

When I interrupt the computation, I get the following traceback:

^C
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "_sage_input_33.py", line 10, in <module>
    exec compile(u'open("___code___.py","w").write("# -*- coding: utf-8 -*-\\n" + _support_.preparse_worksheet_cell(base64.b64decode("dGltZSB0cmVuemEoZixbMSxJLC0xLC1JLDFdLGV4YWN0PUZhbHNlLHN0ZXA9MC41KQ=="),globals())+"\\n"); execfile(os.path.abspath("___code___.py"))
  File "", line 1, in <module>
    
  File "/tmp/tmpFpew9q/___code___.py", line 3, in <module>
    exec compile(u'__time__=misc.cputime(); __wall__=misc.walltime(); trenza(f,[_sage_const_1 ,I,-_sage_const_1 ,-I,_sage_const_1 ],exact=False,step=_sage_const_0p5 ); print "Time: CPU %.2f s, Wall: %.2f s"%(misc.cputime(__time__), misc.walltime(__wall__))
  File "", line 1, in <module>
    
  File "/tmp/tmpOfjSpn/___code___.py", line 5, in trenza
    return trenza(f,points[:_sage_const_2 ],exact,step,precision)*trenza(f,points[_sage_const_1 :],exact,step,precision)
  File "/tmp/tmpOfjSpn/___code___.py", line 21, in trenza
    l=list(siguehilo(args))
  File "/home/mmarco/sage-5.7.beta3/local/lib/python2.7/site-packages/sage/parallel/use_fork.py", line 189, in __call__
    os.wait()
  File "c_lib.pyx", line 68, in sage.ext.c_lib.sage_python_check_interrupt (sage/ext/c_lib.c:736)
KeyboardInterrupt
__SAGE__

I really don't know how to track down the bug or how to debug it.

Depends on #14150

Component: memleak

Keywords: parallel

Issue created by migration from https://trac.sagemath.org/ticket/14154

@jdemeyer

comment:1

It would be good to provide more minimal code exhibiting the problem.

I'm setting the dependency simply because any patch here might conflict with #14150.

@jdemeyer

Dependencies: #14150

@jdemeyer

comment:2

Also, it seems you are running in the notebook. Does running it in the command-line make a difference?

@miguelmarco
Contributor Author

comment:4

I have trimmed the code down a little, but it is still very big.

I have tested on two different systems, and the probability of hitting the problem seems to vary a lot between them. I have also experienced the same problem on the command line.

To trigger it I have to try several possibilities for the "exact" and "step" parameters. A combination that seems to trigger the freeze more often is this:

[trenza(f,[1,I,-1,-I,1],exact=True,step=0.5) for i in range(5)]

Each separate call of

trenza(f,[1,I,-1,-I,1],exact=True,step=0.5) 

takes around 3 seconds on my computer. But the list of five iterations doesn't give any answer even after several minutes.
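
One way to pin down which repetition gets stuck is to run each call in a child process with a deadline, so the parent polls instead of blocking in os.wait() as in the traceback above. The following is only a minimal plain-Python sketch for Unix-like systems. The names run_with_deadline and first_stuck_run are hypothetical helpers, not part of Sage, and this is not how sage/parallel/use_fork.py is implemented.

```python
import os
import signal
import time


def run_with_deadline(func, args=(), timeout=10.0, poll=0.01):
    """Run func(*args) in a forked child process (hypothetical helper).

    Returns the child's exit code (0 on success, 1 if func raised),
    a negative signal number if the child died from a signal, or
    None if the child was still running after `timeout` seconds
    and had to be killed.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: do the work and leave immediately via os._exit,
        # so no cleanup handlers inherited from the parent run here.
        code = 0
        try:
            func(*args)
        except BaseException:
            code = 1
        os._exit(code)

    # Parent process: poll with WNOHANG instead of blocking in os.wait(),
    # so a stuck child cannot stall the parent forever.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return -os.WTERMSIG(status)
        time.sleep(poll)

    # Deadline passed: kill the child and reap it to avoid a zombie.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return None


def first_stuck_run(func, args=(), repeats=5, timeout=10.0):
    """Repeat the call; return the index of the first run that timed out, else None."""
    for i in range(repeats):
        if run_with_deadline(func, args, timeout) is None:
            return i
    return None
```

In a Sage session this could presumably be called as first_stuck_run(trenza, (f, [1, I, -1, -I, 1], True, 0.5), repeats=5, timeout=60), which is an untested assumption. Note that the sketch kills only the direct child, so any worker processes that @parallel forked from it would be left running.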

@jdemeyer

jdemeyer commented Apr 1, 2013

comment:5

It would be really good if you could simplify the code to better find out where it goes wrong.

@vbraun
Member

vbraun commented Apr 1, 2013

comment:6

What's the expected output?

sage: [trenza(f,[1,I,-1,-I,1],exact=True,step=0.5) for i in range(5)]
[(None, (None, (None, None))),
 (None, (None, (None, None))),
 (None, (None, (None, None))),
 (None, (None, (None, None))),
 (None, (None, (None, None)))]
sage: trenza(f,[1,I,-1,-I,1],exact=True,step=0.5) 
(None, (None, (None, None)))

Loop or no loop makes no difference here on Fedora 18 x86_64. Which OS are you on?

@miguelmarco
Contributor Author

comment:7

The expected result is basically no output (the trenza function returns nothing; it is only there to trigger the problem). As I said, the problem appears somewhat randomly. Did you try running it several times?

And as Murphy's law dictates, now the problem doesn't show up on my system either ;)

I checked on both a Mageia server and my Gentoo box. Both are x86_64, with sage-5.7 compiled from source.
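
For reference, the nested tuples of None in the output quoted earlier follow from the recursion at the top of trenza: with more than two points it returns a pair (first segment, rest of the path), and the two-point base case has no return statement. A minimal plain-Python sketch of just that control flow, where nested_shape is a hypothetical stand-in for trenza with all the numerical work removed:

```python
def nested_shape(points):
    """Mimic only the recursion of trenza; the base case returns None."""
    if len(points) > 2:
        # Same split as in trenza: first segment, then the remaining path.
        return nested_shape(points[:2]), nested_shape(points[1:])
    # Two points left: trenza does its work here and returns nothing.
    return None


# Five points, as in [1, I, -1, -I, 1], give four segments.
print(nested_shape([1, 2, 3, 4, 5]))  # prints (None, (None, (None, None)))
```

So output of this shape only means that every segment finished. The freeze shows up as the call never returning at all.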

@jdemeyer jdemeyer modified the milestones: sage-5.11, sage-5.12 Aug 13, 2013
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.1, sage-6.2 Jan 30, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.2, sage-6.3 May 6, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.3, sage-6.4 Aug 10, 2014
@mkoeppe mkoeppe removed this from the sage-6.4 milestone Dec 29, 2022