pubsub sometimes hangs on Windows #4778

Open · vmx opened this issue Mar 6, 2018 · 5 comments

vmx (Member) commented Mar 6, 2018

Version information:

go-ipfs version: 0.4.13-
Repo version: 6
System version: amd64/windows
Golang version: go1.9.2

Type: Bug

Description:

The pubsub test sometimes hangs when run through js-ipfs-api. Steps to reproduce:

  • Get a working environment on Windows:
@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"
choco install googlechrome
choco install nodejs
choco install git
npm install -g windows-build-tools
RefreshEnv
  • git clone https://github.com/ipfs/js-ipfs-api pubsubbug
  • cd pubsubbug
  • npm install
  • replace the file in node_modules\interface-ipfs-core\js\src\pubsub.js with the code below
  • npx mocha test\interface\pubsub.spec.js --exit

Sometimes not all 100 messages will be received and the test will time out.

/* eslint-env mocha */
/* eslint max-nested-callbacks: ['error', 8] */
'use strict'

const chai = require('chai')
const dirtyChai = require('dirty-chai')
const expect = chai.expect
chai.use(dirtyChai)
const series = require('async/series')
const waterfall = require('async/waterfall')
const parallel = require('async/parallel')
const whilst = require('async/whilst')
const each = require('async/each')
const hat = require('hat')

function waitForPeers (ipfs, topic, peersToWait, callback) {
  // Poll the topic's peer list every 500ms until all expected peers show up.
  const i = setInterval(() => {
    ipfs.pubsub.peers(topic, (err, peers) => {
      if (err) {
        // Stop polling on error so the callback only fires once.
        clearInterval(i)
        return callback(err)
      }

      // Peers from peersToWait that are not yet listed on the topic.
      const missingPeers = peersToWait
        .map((e) => peers.indexOf(e) !== -1)
        .filter((e) => !e)

      if (missingPeers.length === 0) {
        clearInterval(i)
        callback()
      }
    })
  }, 500)
}

function spawnWithId (factory, callback) {
  // Spawn a node and attach its identity info (peer ID and addresses) as node.peerId.
  waterfall([
    (cb) => factory.spawnNode(cb),
    (node, cb) => node.id((err, res) => {
      if (err) {
        return cb(err)
      }
      node.peerId = res
      cb(null, node)
    })
  ], callback)
}

module.exports = (common) => {
  describe('.pubsub', function () {
    this.timeout(80 * 1000)

    const getTopic = () => 'pubsub-tests-' + hat()

    let ipfs1
    let ipfs2
    let ipfs3

    before(function (done) {
      // CI takes longer to instantiate the daemon, so we need to increase the
      // timeout for the before step
      this.timeout(100 * 1000)

      common.setup((err, factory) => {
        if (err) {
          return done(err)
        }

        series([
          (cb) => spawnWithId(factory, cb),
          (cb) => spawnWithId(factory, cb),
          (cb) => spawnWithId(factory, cb)
        ], (err, nodes) => {
          if (err) {
            return done(err)
          }

          ipfs1 = nodes[0]
          ipfs2 = nodes[1]
          ipfs3 = nodes[2]
          done()
        })
      })
    })

    after((done) => {
      common.teardown(done)
    })

    describe('multiple nodes connected', () => {
      before((done) => {
        parallel([
          (cb) => ipfs1.swarm.connect(ipfs2.peerId.addresses.find((a) => a.includes('127.0.0.1')), cb),
          (cb) => ipfs2.swarm.connect(ipfs3.peerId.addresses.find((a) => a.includes('127.0.0.1')), cb),
          (cb) => ipfs1.swarm.connect(ipfs3.peerId.addresses.find((a) => a.includes('127.0.0.1')), cb)
        ], (err) => {
          if (err) {
            return done(err)
          }
          // give some time to let everything connect
          setTimeout(done, 300)
        })
      })

      describe('load tests', function () {
        before(() => {
          ipfs1.pubsub.setMaxListeners(10 * 1000)
          ipfs2.pubsub.setMaxListeners(10 * 1000)
        })

        after(() => {
          ipfs1.pubsub.setMaxListeners(10)
          ipfs2.pubsub.setMaxListeners(10)
        })


        describe('send/receive', () => {
          let topic
          let sub1
          let sub2

          before(() => {
            topic = getTopic()
          })

          after(() => {
            ipfs1.pubsub.unsubscribe(topic, sub1)
            ipfs2.pubsub.unsubscribe(topic, sub2)
          })

          it('send/receive 10k messages', function (done) {
            this.timeout(2 * 60 * 1000)

            const msgBase = 'msg - '
            const count = 100
            let sendCount = 0
            let receivedCount = 0
            let startTime
            let counter = 0

            sub1 = (msg) => {
              console.log("vmx: message:", receivedCount)
              // go-ipfs can't send messages in order when there are
              // only two nodes in the same machine ¯\_(ツ)_/¯
              // https://github.com/ipfs/js-ipfs-api/pull/493#issuecomment-289499943
              // const expectedMsg = msgBase + receivedCount
              // const receivedMsg = msg.data.toString()
              // expect(receivedMsg).to.eql(expectedMsg)

              receivedCount++

              if (receivedCount >= count) {
                const duration = new Date().getTime() - startTime
                const opsPerSec = Math.floor(count / (duration / 1000))

                console.log(`Send/Receive 10k messages took: ${duration} ms, ${opsPerSec} ops / s\n`)

                check()
              }
            }

            sub2 = (msg) => {}

            function check () {
              if (++counter === 2) {
                done()
              }
            }

            series([
              (cb) => ipfs1.pubsub.subscribe(topic, sub1, cb),
              (cb) => ipfs2.pubsub.subscribe(topic, sub2, cb),
              (cb) => waitForPeers(ipfs1, topic, [ipfs2.peerId.id], cb)
            ], (err) => {
              expect(err).to.not.exist()
              startTime = new Date().getTime()

              whilst(
                () => sendCount < count,
                (cb) => {
                  const msgData = Buffer.from(msgBase + sendCount)
                  sendCount++
                  ipfs2.pubsub.publish(topic, msgData, cb)
                },
                check
              )
            })
          })
        })


      })
    })
  })
}
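
For reference, here is a stripped-down sketch of the failing pattern outside the mocha harness. This is hypothetical and not part of the test suite: it assumes two local daemons with pubsub enabled, HTTP APIs on 127.0.0.1 ports 5001 and 5002, and swarms already connected to each other.

// Hypothetical stripped-down repro of the failing pattern; not part of the
// original test file. Assumes two local daemons with pubsub enabled whose
// HTTP APIs listen on 127.0.0.1:5001 and 127.0.0.1:5002 and whose swarms
// are already connected.
'use strict'

const ipfsAPI = require('ipfs-api')

const ipfs1 = ipfsAPI('/ip4/127.0.0.1/tcp/5001') // subscriber
const ipfs2 = ipfsAPI('/ip4/127.0.0.1/tcp/5002') // publisher

const topic = 'pubsub-hang-check-' + Date.now()
const count = 100
let received = 0

ipfs1.pubsub.subscribe(topic, () => {
  if (++received === count) {
    console.log(`received all ${count} messages`)
    process.exit(0)
  }
}, (err) => {
  if (err) throw err

  // Poll until the publisher sees at least one peer on the topic (mirrors
  // waitForPeers() in the test above), then publish the messages one by one.
  let started = false
  const poll = setInterval(() => {
    ipfs2.pubsub.peers(topic, (err, peers) => {
      if (err) throw err
      if (started || peers.length === 0) return
      started = true
      clearInterval(poll)

      let sent = 0
      const publishNext = () => {
        if (sent === count) return
        ipfs2.pubsub.publish(topic, Buffer.from('msg - ' + sent++), (err) => {
          if (err) throw err
          publishNext()
        })
      }
      publishNext()
    })
  }, 500)
})

// If the hang occurs, report how many of the messages actually arrived.
setTimeout(() => {
  console.error(`timed out: received only ${received} of ${count} messages`)
  process.exit(1)
}, 60 * 1000)

When it hangs, the timeout reports how many messages made it through, which makes it easier to see whether delivery stops at a consistent point.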

whyrusleeping (Member) commented:

@vmx is this only on Windows, and not on any other OS? It seems very fishy that this (without touching the CLI) would behave any differently than on any other system.

vmx (Member, Author) commented Mar 7, 2018

@whyrusleeping I started looking into this because CI was failing a lot on this test case on Windows (https://ci.ipfs.team/blue/organizations/jenkins/IPFS%2Fjs-ipfs-api/detail/PR-705/4/tests). I was then able to reproduce it easily in a Windows VM. It seems to be a race condition; I'd say it happens in about 60% of the cases. So far I haven't seen this failure on Linux.

It also seems that the more messages are sent, the more likely it is to happen. I was able to reproduce it with 10 messages as well, but then the test passed more often.
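
Since the failure is intermittent, one way to quantify it is to run the spec in a loop and count how many runs fail. A rough sketch of such a helper script (hypothetical, not part of js-ipfs-api):

// run-loop.js - hypothetical helper: run the pubsub spec repeatedly and
// count how many runs fail (e.g. by timing out). Not part of js-ipfs-api.
'use strict'

const { execSync } = require('child_process')

const runs = 10
let failures = 0

for (let i = 0; i < runs; i++) {
  try {
    // mocha exits non-zero when the test fails or times out, which makes execSync throw
    execSync('npx mocha test/interface/pubsub.spec.js --exit', { stdio: 'inherit' })
  } catch (err) {
    failures++
  }
}

console.log(`${failures} of ${runs} runs failed`)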

vmx (Member, Author) commented Mar 7, 2018

There's already some discussion about it at ipfs-inactive/interface-js-ipfs-core#188.

djdv mentioned this issue Mar 12, 2018

djdv (Contributor) commented Mar 28, 2018

@vmx I need some clarification: did it turn out that this was not Windows-specific, and does libp2p/go-libp2p-pubsub#53 resolve the issue?

vmx (Member, Author) commented Mar 28, 2018

djdv self-assigned this May 14, 2019