Discussion:
[psad-discuss] PSAD randomly crashes
Rinck Sonnenberg
2016-06-27 19:55:38 UTC
Hi,

I've been running PSAD on around 20 servers for a while now and lately PSAD
crashes randomly on almost all of these 20 servers. The log file shows
nothing in particular (attached), except for a (re)start.

The crash happens almost daily on any one server, and the log always shows
the same thing. The configuration is exactly the same on all servers (managed
by puppet). Puppet always restarts the PSAD service on each machine
successfully (no manual intervention required).

However, I would like to understand why it crashes in the first place.
Attached are my config file and answer file used to install PSAD. I'm using
version:

***@vps:/# psad -V
[+] psad v2.4.3 by Michael Rash <***@cipherdyne.org>

I do see a bunch of notifications in the errs/psad.die file:

Sat Jun 11 18:22:46 2016 psad v2.4.3 pid: 21328 whois alarm at
/usr/sbin/psad line 7397, <$fwdata_fh> line 1275.
Tue Jun 21 12:04:12 2016 psad v2.4.3 pid: 30888 whois alarm at
/usr/sbin/psad line 7397, <$fwdata_fh> line 854.

These also show up in the psad.warn file:

Sat Jun 11 18:22:46 2016 psad v2.4.3 pid: 21328 whois alarm at
/usr/sbin/psad line 7397, <$fwdata_fh> line 1275.
Tue Jun 21 12:04:12 2016 psad v2.4.3 pid: 30888 whois alarm at
/usr/sbin/psad line 7397, <$fwdata_fh> line 854.

But they don't necessarily correspond to the time/date of the crash. The
attached logfile shows a restart for today, but no message appears in the
warn or die log.
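
To line the alarm times up against the restart times, it can help to pull just the timestamp and pid out of each whois alarm entry. This is a minimal sketch, assuming each entry sits on a single line in the log file (the wrapping above comes from the mail client). The helper name `whois_alarm_times` and the path in the example are assumptions, not something psad ships:

```shell
# Print "<timestamp> pid=<pid>" for every whois alarm entry in the given log.
# Assumes one entry per line, in the format quoted above.
whois_alarm_times() {
    sed -n 's/^\(.*\) psad v[0-9.]* pid: \([0-9]*\) whois alarm.*/\1 pid=\2/p' "$1"
}

# Example (path is an assumption; adjust to where your errs/ directory lives):
#   whois_alarm_times /var/log/psad/errs/psad.die
```

The output can then be compared by eye with the (re)start times in the main log.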

Any clue as to what is wrong? Is this a configuration error? Or am I
encountering some sort of bug?

All my servers are running a completely up-to-date version of Ubuntu 14.04
server LTS:

***@vps:/# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.4 LTS
Release: 14.04
Codename: trusty

Any help is very much appreciated!

Regards,
Rinck
Michael Rash
2016-06-28 03:09:44 UTC
Hello Rinck,
Post by Rinck Sonnenberg
Any clue as to what is wrong? Is this a configuration error? Or am I
encountering some sort of bug?
This looks to me as though the whois lookups are taking a long time to
complete, since the alarms are being triggered. It is conceivable that a
large number of hanging whois processes is exposing an issue. By default,
psad will cache whois lookup data according to the WHOIS_LOOKUP_THRESHOLD
variable, but on very busy systems this threshold may be too low.

As a test, could you try disabling whois lookups altogether on one of the
systems where restarts are consistent? (Set ENABLE_WHOIS_LOOKUPS to N.) If
this seems to fix the restart issue, then we know where the culprit is.
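
To confirm what a given host actually has configured, the value can be read back from the config file. This is a rough sketch, assuming psad.conf uses the usual `NAME value;` layout with a trailing semicolon. The helper name `psad_conf_value` and the path in the example are assumptions:

```shell
# Print the value assigned to a variable in a psad.conf-style file.
# Assumes the "NAME   value;" layout with a trailing semicolon.
psad_conf_value() {
    # $1 = config file, $2 = variable name
    awk -v name="$2" '$1 == name { sub(/;.*$/, "", $2); print $2; exit }' "$1"
}

# Example (path is an assumption):
#   psad_conf_value /etc/psad/psad.conf ENABLE_WHOIS_LOOKUPS
```

Commented-out lines are skipped because their first field starts with `#`, and an unset variable prints nothing.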

Thanks,

--Mike
_______________________________________________
psad-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/psad-discuss
--
Michael Rash | Founder
http://www.cipherdyne.org/
Key fingerprint = 53EA 13EA 472E 3771 894F AC69 95D8 5D6B A742 839F
Rinck Sonnenberg
2016-06-28 07:49:41 UTC
Hey Mike,

thanks for your reply! I've updated the settings (and added a few that were
missing from my config files) and will monitor the service for a couple of
days. I will let you know what my findings are!

- Rinck



Kind regards,
Rinck H. Sonnenberg
Netson Internet Oplossingen

____________________________
*Netson Internet Oplossingen*
Beatrixlaan 2
3881 ME Putten (NL)
KvK: 14092230
BTW: NL1934.08.934.B01
*IBAN: NL02 RABO 0106 1096 77 (new)*
SWIFT: RABONL2U

E: ***@netson.nl
T: +31(0)653460801
W: www.netson.nl
Rinck Sonnenberg
2016-06-28 11:53:22 UTC
Hi Mike,

unfortunately PSAD just stopped and started again on one of my servers. I just
verified that ENABLE_DNS_LOOKUPS and ENABLE_WHOIS_LOOKUPS are both set to N.
I did notice that the error messages in the psad.die and psad.warn logs have
disappeared since I disabled the WHOIS lookups, but the crashing continues.
Any additional suggestions?

Thanks for all your help so far!

- Rinck
Michael Rash
2016-06-28 12:27:29 UTC
Post by Rinck Sonnenberg
I did notice that the error messages in the psad.die and psad.warn logs have
disappeared since I disabled the WHOIS lookups, but the crashing continues.
Any additional suggestions?
Hmm, ok. Are there any syslog messages that look suspicious - including
kern.log or dmesg? For example, is there an AppArmor policy deployed that
is killing psad for some reason? Are the systems consistently running on
very low memory?

Depending on the amount of data, could you bzip2 compress your iptables
logs around the time of the crash and send them to me (assuming you are ok
with this)? I'd like to try and reproduce the crash you are seeing.
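
A quick way to look for those signs is to filter the kernel and syslog output for OOM-killer activity and AppArmor denials. This is only a sketch: the search patterns are a starting point rather than an exhaustive list, the helper name `suspicious_lines` is hypothetical, and the log file names in the examples are assumptions for a stock Ubuntu 14.04 install:

```shell
# Print log lines that suggest psad was killed from outside:
# OOM-killer activity or AppArmor denials. Reads stdin when no file is given.
suspicious_lines() {
    # "|| true" keeps a "no matches" result from looking like a failure
    grep -hiE 'oom-killer|out of memory|killed process|apparmor="DENIED"' "$@" || true
}

# Examples (file names are assumptions):
#   suspicious_lines /var/log/kern.log /var/log/syslog
#   dmesg | suspicious_lines
#   bzip2 -c /var/log/kern.log > /tmp/kern.log.bz2   # compressed copy, original untouched
```

If nothing matches around the time of a restart, that at least makes the OOM killer and AppArmor less likely suspects.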

Thanks,

--Mike
--
Michael Rash | Founder
http://www.cipherdyne.org/
Key fingerprint = 53EA 13EA 472E 3771 894F AC69 95D8 5D6B A742 839F