Server freeze #1000

Closed
AndreFaramir opened this issue Oct 1, 2014 · 6 comments
Labels
needs-triage Needs testing with screenshot/video confirmation

Comments

@AndreFaramir

Somehow the server freezes.

To reproduce it, I log in with 2 clients, one admin and one regular player:
step 1: the admin logs in, outside the pz
step 2: the player logs in, inside the pz
step 3: the admin summons a trainer and then creates 8 demons; the demons start attacking the trainer, and some time later the server freezes

Observations:

  • It only happens if the player is watching; otherwise the server won't freeze
  • The demons must be attacking the trainer
  • It takes some time to freeze; for me it took at most 10 minutes
  • Server and clients are running on the same computer

It is very weird because the freeze points to ServiceManager::run. Even with a lot of gdb symbols I can't work out what is going on well enough to fix it myself, which is very frustrating.

gdb bt (taken after killing the server; it has frozen many times already and the backtrace always points to the same ServiceManager::run):
(gdb) bt

#0  0x00000000007ebf3b in boost::asio::detail::op_queue::push (this=0x7fff813c3750, q=...)
    at /usr/local/include/boost/asio/detail/op_queue.hpp:124
#1  0x00000000007ebbc1 in boost::asio::detail::timer_queue >::get_ready_timers (this=0x1ea9aa8, ops=...)
    at /usr/local/include/boost/asio/detail/timer_queue.hpp:166
#2  0x00000000009daf8d in boost::asio::detail::timer_queue_set::get_ready_timers (this=0x1ea9b38, ops=...) at /usr/local/include/boost/asio/detail/impl/timer_queue_set.ipp:88
#3  0x00000000009db386 in boost::asio::detail::epoll_reactor::run (this=0x1ea9af0, block=true, ops=...) at /usr/local/include/boost/asio/detail/impl/epoll_reactor.ipp:423
#4  0x00000000009db87a in boost::asio::detail::task_io_service::do_one (this=0x1ea9a00, lock=..., this_idle_thread=0x7fff813c37d0) at /usr/local/include/boost/asio/detail/impl/task_io_service.ipp:278
#5  0x00000000009db618 in boost::asio::detail::task_io_service::run (this=0x1ea9a00, ec=...) at /usr/local/include/boost/asio/detail/impl/task_io_service.ipp:131
#6  0x00000000009dbb37 in boost::asio::io_service::run (this=0x7fff813c38f0) at /usr/local/include/boost/asio/impl/io_service.ipp:57
#7  0x00000000009d98e3 in ServiceManager::run (this=0x7fff813c38c0) at /home/andre/Desktop/Projetos/forgottenserver/src/server.cpp:52
#8  0x00000000009528f5 in main (argc=1, argv=0x7fff813c3b38) at /home/andre/Desktop/Projetos/forgottenserver/src/otserv.cpp:114

Digging deeper, I found that the server hangs in

boost::asio::detail::timer_queue >::get_ready_timers (this=0x1ea9aa8, ops=...) at /usr/local/include/boost/asio/detail/timer_queue.hpp:167
167       remove_timer(*timer);
(gdb) n
163     while (!heap_.empty() && !Time_Traits::less_than(now, heap_[0].time_))
(gdb) n
165       per_timer_data* timer = heap_[0].timer_;
(gdb) n
166       ops.push(timer->op_queue_);
(gdb) n

It keeps repeating these steps over and over, consuming 100% CPU, until I kill the server.
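To make the stepping above easier to follow, here is a minimal, standard-library-only model of the kind of drain loop gdb shows at timer_queue.hpp lines 163-167. It is not Boost's code: `TimerEntry`, `LaterFirst`, `TimerHeap` and `drainReadyTimers` are hypothetical names made up for this sketch, and a `std::priority_queue` stands in for the real heap. It only illustrates that such a loop terminates because every pass removes the entry at the top of the heap. If the removal step left the heap unchanged for any reason, the same entry would be examined forever, which would look like the repeating 163/165/166/167 sequence. Whether that is what actually happens here is not established.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical stand-in for a timer entry; not Boost's per_timer_data.
struct TimerEntry {
    long expiry;  // absolute expiry time, arbitrary units
    int id;       // identifies the timer in this sketch
};

// Orders entries so that std::priority_queue behaves as a min-heap on expiry.
struct LaterFirst {
    bool operator()(const TimerEntry& a, const TimerEntry& b) const {
        return a.expiry > b.expiry;
    }
};

using TimerHeap =
    std::priority_queue<TimerEntry, std::vector<TimerEntry>, LaterFirst>;

// Collects the ids of all timers due at or before `now`, earliest first.
// The loop makes progress only because pop() shrinks the heap on each pass.
std::vector<int> drainReadyTimers(TimerHeap& heap, long now) {
    std::vector<int> ready;
    while (!heap.empty() && heap.top().expiry <= now) {
        ready.push_back(heap.top().id);
        heap.pop();  // the removal step; without it the loop never ends
    }
    return ready;
}
```

In this model, a heap with expiries 10, 20 and 30 drained at `now = 20` yields the first two timers and leaves the third in place.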

@marksamman marksamman changed the title Server freeze Server freeze Oct 1, 2014
@marksamman
Member

Have you made any changes to the source code? Is anyone else able to reproduce this?

@AndreFaramir
Author

Everything is original, even the data; not a single line was edited.

I just cloned the repository again and compiled it with the latest files and source, and the bug still happens. The bug appears to have 3 steps:

step1) http://i.imgur.com/ICswQRn.png
The server stops sending packets to one client. In this image you can see the demon in a different position on the adm faramir client and on the faramir client. The adm faramir client has stopped receiving packets: I try to move, but it does not show any movement, while on my faramir client I can see adm faramir moving. At this step the server has not frozen yet, but something is already going wrong in io_service.

step2) http://i.imgur.com/JYDDw3G.png
The server is now frozen; you can see that it is consuming 100% CPU.

step3) http://i.imgur.com/bzCfE9R.png
The server is now offline. I can't connect, and it only shuts down after I kill it.

#########################################

@djarek
Contributor

djarek commented Oct 1, 2014

Could you give us a backtrace of all threads?
thread apply all bt

@AndreFaramir
Author

sure

(gdb) thread apply all bt
Thread 3 (Thread 0x7f5e5c695700 (LWP 28011)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f5e5ddb566c in std::condition_variable::wait(std::unique_lock&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00000000009ee15c in Dispatcher::dispatcherThread (
    this=0xd64ae0 )
    at /home/andre/Desktop/Projetos/forgottenorg/src/tasks.cpp:54
#3  0x00000000009efdb7 in std::_Mem_fn::operator()<, void>(Dispatcher*) const (this=0xf62078, __object=0xd64ae0 )
    at /usr/include/c++/4.8/functional:601
#4  0x00000000009efd07 in std::_Bind_simple (Dispatcher*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=0xf62070)
    at /usr/include/c++/4.8/functional:1732
#5  0x00000000009efc0f in std::_Bind_simple (Dispatcher*)>::operator()() (this=0xf62070)
    at /usr/include/c++/4.8/functional:1720
#6  0x00000000009efba8 in std::thread::_Impl (Dispatcher*)> >::_M_run() (this=0xf62058)
    at /usr/include/c++/4.8/thread:115
#7  0x00007f5e5ddb8bf0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007f5e5cdf2182 in start_thread (arg=0x7f5e5c695700)
    at pthread_create.c:312
#9  0x00007f5e5d52030d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 2 (Thread 0x7f5e5be94700 (LWP 28012)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00000000009d0c68 in __gthread_cond_timedwait (
    __cond=0xd64b90 , __mutex=0xd64b68 , 
    __abs_timeout=0x7f5e5be93d60)
    at /usr/include/x86_64-linux-gnu/c++/4.8/bits/gthr-default.h:871
#2  0x00000000009d23c5 in std::condition_variable::__wait_until_impl > > (
    this=0xd64b90 , __lock=..., __atime=...)
    at /usr/include/c++/4.8/condition_variable:160
#3  0x00000000009d1bfd in std::condition_variable::wait_until > > (this=0xd64b90 , 
    __lock=..., __atime=...) at /usr/include/c++/4.8/condition_variable:100
#4  0x00000000009d0f58 in Scheduler::schedulerThread (
    this=0xd64b60 )
    at /home/andre/Desktop/Projetos/forgottenorg/src/scheduler.cpp:48
#5  0x00000000009d45f5 in std::_Mem_fn::operator()<, void>(Scheduler*) const (this=0xf62208, __object=0xd64b60 )
    at /usr/include/c++/4.8/functional:601
#6  0x00000000009d4545 in std::_Bind_simple (Scheduler*)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=0xf62200)
    at /usr/include/c++/4.8/functional:1732
#7  0x00000000009d444d in std::_Bind_simple (Scheduler*)>::operator()() (this=0xf62200)
    at /usr/include/c++/4.8/functional:1720
#8  0x00000000009d43e6 in std::thread::_Impl (Scheduler*)> >::_M_run() (this=0xf621e8)
    at /usr/include/c++/4.8/thread:115
#9  0x00007f5e5ddb8bf0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007f5e5cdf2182 in start_thread (arg=0x7f5e5be94700)
    at pthread_create.c:312
#11 0x00007f5e5d52030d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 1 (Thread 0x7f5e5ee14780 (LWP 28010)):
#0  0x00000000007dc26f in boost::asio::detail::op_queue::empty (this=0xf61a58)
    at /usr/local/include/boost/asio/detail/op_queue.hpp:139
#1  0x00000000009d6de9 in boost::asio::detail::task_io_service::do_one (
    this=0xf61a00, lock=..., this_idle_thread=0x7fffab8859a0)
    at /usr/local/include/boost/asio/detail/impl/task_io_service.ipp:248
#2  0x00000000009d6cda in boost::asio::detail::task_io_service::run (
    this=0xf61a00, ec=...)
    at /usr/local/include/boost/asio/detail/impl/task_io_service.ipp:131
#3  0x00000000009d71f9 in boost::asio::io_service::run (this=0x7fffab885ac0)
    at /usr/local/include/boost/asio/impl/io_service.ipp:57
#4  0x00000000009d4fa5 in ServiceManager::run (this=0x7fffab885a90)
    at /home/andre/Desktop/Projetos/forgottenorg/src/server.cpp:52
#5  0x000000000094eea9 in main (argc=1, argv=0x7fffab885d08)
    at /home/andre/Desktop/Projetos/forgottenorg/src/otserv.cpp:114

@marksamman marksamman added the needs-triage Needs testing with screenshot/video confirmation label Oct 1, 2014
@AndreFaramir
Author

Maybe the bug is caused by my Boost version? Which version of Boost are you using?

@dalkon
Contributor

dalkon commented Oct 12, 2014

I can't reproduce it using the steps provided; I tried several times.

@dalkon dalkon closed this as completed Oct 12, 2014