Ewon flipping online/offline

I am having issues connecting to a couple of my Flexys.

Looking at my eCatcher, the top four Flexys are the ones I’m having issues with. eCatcher shows these Flexys as online, but when I try to connect and access the “index.shtm” page, the site cannot be reached. I am able to connect to Flexys at other locations through this eCatcher.

When I look at the eCatcher logs for these four Flexys, I see that they toggle between online and offline every couple of minutes.
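One way to check whether the drops are periodic (which would point toward a repeating cycle such as a reboot loop rather than random interference) is to measure the time between consecutive drops. The sketch below is only an illustration: it assumes you have already transcribed the log entries by hand into `(timestamp, state)` pairs, and the function name `offline_intervals` is hypothetical, not part of any Ewon tool.

```python
from datetime import datetime
from statistics import median


def offline_intervals(events):
    """Return the seconds between consecutive transitions to 'offline'.

    events: iterable of (datetime, state) pairs in time order, where state
    is the string 'online' or 'offline' (an assumed, hand-built format).
    """
    drops = []
    previous_state = None
    for timestamp, state in events:
        # Record only the moment the device goes from not-offline to offline.
        if state == "offline" and previous_state != "offline":
            drops.append(timestamp)
        previous_state = state
    return [(later - earlier).total_seconds()
            for earlier, later in zip(drops, drops[1:])]


if __name__ == "__main__":
    # Made-up example timestamps, for illustration only.
    sample = [
        (datetime(2024, 1, 1, 8, 0, 0), "online"),
        (datetime(2024, 1, 1, 8, 2, 0), "offline"),
        (datetime(2024, 1, 1, 8, 3, 0), "online"),
        (datetime(2024, 1, 1, 8, 5, 0), "offline"),
    ]
    gaps = offline_intervals(sample)
    print("gaps (s):", gaps, "median:", median(gaps))
```

If the gaps cluster tightly around one value, a fixed cycle on the device side is more likely than a network problem.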

My IT team has looked at the network in this location and there do not appear to be any issues. I have also ruled out power loss.

Have you experienced this issue before? What would be the recommended troubleshooting steps to move forward?

Hi garyR,

Are all four of these Flexys at the same location? Could you try to capture a backup with support files of one of them? You may be able to do this remotely as it looks like they’re staying online for a couple of minutes each cycle, but if not, someone local can connect a laptop with eBuddy on it to one of the Ewon’s LAN ports. Once connected (remotely or locally), open eBuddy, select the Ewon device you’d like help with, and click “Backup/Restore”. Make sure to check “include support files”, then click through to create a backup.

If you don’t have it, you can download eBuddy here: Ewon Technical Support - All Downloads

This backup will let us see the device’s logs and configuration, which should give us a better view of what’s happening. It’s hard to say without seeing the error messages being logged, but one possibility that comes to mind is interference from something like a VFD running every few minutes, which could disrupt the devices’ connections enough to kick them offline while it’s running.

Best regards,
Hugh

Hey Hugh,

Yes, all Flexys are in the same room but on different lines. I’ve attached two of the log files, retrieved locally through eBuddy.

I did notice that even when connected locally, the device would become “unreachable” after a couple of minutes, though I never noticed the power light going out. All the Flexys are also on separate power supplies and had been functioning properly up until very recently (Monday), and we haven’t had any power issues with other machines in there.

One other thing I noticed is that even when I am able to connect briefly, everything seems to be running much slower than usual.

Hi GaryR,

We’re still trying to work out what’s causing your devices to go offline. We can tell from the logs that they are rebooting very frequently, every few minutes, and it seems to be related to the scripts that they’re running. Our best guess is that something is interfering with their connection and preventing them from exporting the data they’re collecting to AWS, which builds up until they can’t buffer any more and are forced to reboot.

Unfortunately, it’s not totally clear what’s causing this backlog to form. There is one logged event in each of the backups indicating interference detected on one of the Ewon’s Ethernet ports, so that might be a factor. How long are the Ethernet runs connecting the Ewons to the Internet? Are they shielded? Do they run by any high-voltage equipment?

Best regards,
Hugh

Not sure if this will help or not, but I had an issue similar to this when commissioning Flexy205s with cellular cards as the main connection method. The issue ended up being that I’d either given two of them the same name or slightly altered the name in the unit vs. the name provided in eCatcher. You may want to triple-check your IDs on both sides, just in case.

The other option would be to disable the BASIC script and see if the unit stays online. The error handling and catch is not the greatest. An error in the script, even a minor one, can cause reboots. I have also experienced reboot issues with multiple ONCHANGE calls at the beginning of the BASIC script. I had to put those in a separate function that was called on a timer after the unit booted.

Interesting, I’m wondering if I’d even be able to disable those scripts. It’s been very difficult to log in to the device for even a few seconds without it crashing.

Cables are not shielded and the runs from our switch to the device are fairly long; I’d estimate about 40 ft. They run by some equipment, but it is basic packaging stuff and hasn’t caused any issues in the past.

I agree with tedsch2 that if you’re able to log into the devices to stop the scripts, that would be a good thing to test. I also recommend changing the unshielded cables, as we did see messages related to interference in both of the backups.

I’m unable to connect long enough to stop the script. Should I just do a full reset on one of the devices? If I do a full device reset, would I be able to load the old configuration from a backup file if we find out the issue is not related to the script or other configurations?

Yes, you could load the configuration from a backup file. Just note that it can’t be a backup with support files, only a normal backup with that option unchecked when you capture it.

To carry out a second level reset:

  • Power off the unit.
  • While powering it on, press & maintain the reset button. The LED labeled BI1 turns green.
  • Keep the reset button pressed for approximately 35 seconds until the USR LED remains red steady.
  • When this state is reached, release the button. The LED labeled BI1 turns off.
  • Check that the auto test is successful: the USR LED blinks red with a pattern of 200 ms on and 1500 ms off. The Ewon does NOT restart by itself in normal mode and remains running in this diagnostic mode.
  • Power off the Ewon and power it on again to reboot the unit in a normal mode. As described before, the Ewon returns to its default COM parameters and factory IP addresses (such as LAN 10.0.0.53) after this level 2 reset is performed.

Hey Hugh, I first moved the device to a new location off of our plant floor and set it up there. I saw the same issues there, with short Ethernet cables and no high-voltage equipment nearby, so I do not believe either of these could be the cause.

I did the full level 2 reset and was able to connect without issue. However, when I restored the device to the previous configuration/script and started the script, I ran into the same issues again. I did another reset and this time just copied the MQTT portion of the script into the IDE, and I have been able to run this without issues.

From this I think there must be some configuration issue? Have you been able to determine anything more from the log files? Is there any way I can do a partial restoration with all of the settings/Tag values but leave the script out to test if it is truly just the script causing problems? Thanks for the assistance.

Sounds to me like there is some error in the basic script. Could be you are missing an end statement or are looking at ONCHANGE for a tag that is not ready yet. There could be other issues but without looking at the script it would be a guessing game.
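Since the script is being edited in a text file anyway, a missing end statement can be hunted for offline before loading it back. Below is a minimal sketch of such a check, run on a PC, not on the Flexy. It assumes the block keywords are `FOR`/`NEXT`, multi-line `IF ... THEN`/`ENDIF` and `FUNCTION`/`ENDFN`, that each starts its own line, and that comment lines start with `REM`; adjust the table if your script uses other constructs. The function name `find_unbalanced_blocks` is made up for this example.

```python
import re

# Assumed block keywords (opener -> closer); adjust for your script.
OPENERS = {"FOR": "NEXT", "IF": "ENDIF", "FUNCTION": "ENDFN"}
CLOSERS = {closer: opener for opener, closer in OPENERS.items()}


def find_unbalanced_blocks(script_text):
    """Return a list of (line_number, message) for unmatched block keywords."""
    problems = []
    stack = []  # (keyword, line_number) for blocks that are still open
    for number, raw in enumerate(script_text.splitlines(), start=1):
        upper = raw.strip().upper()
        if not upper or re.match(r"REM\b", upper):
            continue  # skip blank lines and REM comments
        first = re.match(r"[A-Z]+", upper)
        word = first.group(0) if first else ""
        if word == "IF":
            # Only an IF whose line ends in THEN is treated as a block;
            # a one-line "IF ... THEN statement" needs no ENDIF here.
            if re.search(r"\bTHEN\s*$", upper):
                stack.append(("IF", number))
        elif word in OPENERS:
            stack.append((word, number))
        elif word in CLOSERS:
            if stack and stack[-1][0] == CLOSERS[word]:
                stack.pop()
            else:
                problems.append(
                    (number, f"{word} without matching {CLOSERS[word]}"))
    for keyword, number in stack:
        problems.append(
            (number, f"{keyword} is never closed with {OPENERS[keyword]}"))
    return problems


if __name__ == "__main__":
    sample = "FOR i% = 1 TO 3\n  IF a% > 1 THEN\n    PRINT a%\nNEXT i%\n"
    for line_number, message in find_unbalanced_blocks(sample):
        print(line_number, message)
```

This only catches structural mismatches; it will not find the ONCHANGE timing problem mentioned above, and one missing ENDIF can produce several related messages.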

That’s kinda what I’m thinking, will probably have to go through it line by line.

If you need another set of eyes on it let me know.

I was able to connect and pull the backup before resetting the device though.

Hey Hugh,

I was able to get one of the Flexys back online and working. I did a backup, full reset, and restore, then modified the script. I am now having issues getting the second Flexy back online. I attempted to follow the same steps with Flexy #2 but am running into an issue when I attempt to restore. I am seeing this error in eBuddy when I attempt a backup or restore on the fully reset Flexy -

MOVED TO STAFF NOTE

I am able to get connected when I choose the change IP tab though

MOVED TO STAFF NOTE

and when I select “open in browser” I see nothing in my browser and I get the same failed to connect error when I try to update firmware.

Glad to hear you got one of them online. Could you share the backup you managed to capture before resetting it? I’d like to see what it’s reporting.

I’m inclined to think the issue with connecting to it is related to timing. Sometimes with problems like this, where it seems like a script issue is overloading the device, it works inconsistently because there are windows of time where the script isn’t doing too much and the device works semi-normally, only for the status of the script to change and kick things offline again.

What was the process like with the working Flexy? Was it just a lot of trial and error until it happened to work? Or was there a change to the steps you were taking that finally worked, but isn’t working on the other device?

Hey Hugh,
MOVED TO STAFF NOTE (1.5 MB)
MOVED TO STAFF NOTE (1.5 MB)
MOVED TO STAFF NOTE (1.7 MB)

Here are the logs (before reset) for each of the Flexys that have been recovered.

Once I did the backup and reset, I restored the old configuration, then opened the BASIC IDE and removed the script before it started. I modified the script in a text file and slowly added pieces back, running the script after each addition to make sure I didn’t start jumping offline again. I’m not sure of the exact line that was causing issues, but I think it might be due to a tag name that I changed without changing all the references to it in the script (maybe?).
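For anyone else chasing a renamed tag, stale references can be searched for in the script text file before loading it back onto the device. The sketch below is an assumption-heavy illustration, not an Ewon tool: it assumes tags are referenced either as `TagName@` or as a quoted first argument to `ONCHANGE`, `GETIO` or `SETIO`, that names are matched exactly (case-sensitive), and that you supply the list of current tag names yourself. The function name `find_unknown_tags` is hypothetical.

```python
import re

# Assumed reference styles; extend these patterns if your script differs.
# TagName@ (the lookahead skips things like an e-mail address in a string).
AT_REFERENCE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)@(?![\w.])")
# Quoted tag name as the first argument of ONCHANGE, GETIO or SETIO.
QUOTED_REFERENCE = re.compile(
    r'\b(?:ONCHANGE|GETIO|SETIO)\s*\(?\s*"([^"]+)"', re.IGNORECASE)


def find_unknown_tags(script_text, known_tags):
    """Map each referenced name not in known_tags to the lines where it appears."""
    known = set(known_tags)
    unknown = {}
    for number, line in enumerate(script_text.splitlines(), start=1):
        names = AT_REFERENCE.findall(line) + QUOTED_REFERENCE.findall(line)
        for name in names:
            if name not in known:
                unknown.setdefault(name, []).append(number)
    return unknown


if __name__ == "__main__":
    # Made-up script fragment and tag list, for illustration only.
    script = (
        'ONCHANGE "LineSpeed", "GOTO Publish"\n'
        "Publish:\n"
        "  msg$ = STR$ LineSpeed@ + STR$ OldName@\n"
    )
    print(find_unknown_tags(script, {"LineSpeed"}))
```

Any name it reports is worth checking by hand, since the patterns are guesses and can both miss references and flag false positives.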

As a potential improvement, if a tag name is changed, could all references to that tag in the script be updated automatically?

I resolved the eBuddy issue, but I ended up having to restart my computer each time before connecting to a new Flexy; simply restarting the eBuddy app was not working. After a restart, eBuddy did the backup and restore without any issue but, again, I did have to restart my computer three separate times to get into each Flexy. Thanks for the support.