HELPDESK

TOPIC: Interpolation err at Prefetch_boundaries from MARS

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2494

When starting a normal run with 43h2.1.1, I get an "interpolation error" at Prefetch_boundaries. Please see the attached log.

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2495

Apparently the log was not attached. This is the relevant extract from it:


RETRIEVE,
CLASS = OD,
TYPE = AN,
STREAM = SCDA,
EXPVER = 0001,
REPRES = SH,
LEVTYPE = SFC,
PARAM = 31/34/43,
DATE = 20120927,
TIME = 0600,
STEP = 000,
DOMAIN = G,
TARGET = "dummy",
RESOL = AUTO,
ACCURACY = 16,
AREA = 7.9/-9.3/-7.6/9.6,
ROTATION = -50.0/-4.5,
GRID = 0.15/0.15,
PROCESS = LOCAL

mars - INFO - 20210305.160203 - Requesting 3 fields
mars - INFO - 20210305.160203 - FDB home /home/ma/fdbprod
mars - INFO - 20210305.160203 - FDB home /home/ma/fdbbc
67324 FDB; INFO; DB$_ Fields DataBase

mars - INFO - 20210305.160203 - Calling mars on 'marsod', local port is 60061
mars - INFO - 20210305.160203 - Callback at address 10.144.1.163, port 60061
mars - INFO - 20210305.160359 - Mars client is on nid00416 (10.144.1.163) 60061
mars - INFO - 20210305.160359 - Mars server is on dhs0232.ecmwf.int (10.3.2.32) 55660
mars - INFO - 20210305.160359 - Server task is 607 [marsod]
mars - INFO - 20210305.160359 - Request cost: 3 fields, 10.4108 Mbytes online, nodes: mvr08 [marsod]
mars - INFO - 20210305.160359 - The efficiency of your requests in the last 12 hours is 100% [marsod]
mars - INFO - 20210305.160359 - Transfering 10916500 bytes
mars - WARN - 20210305.160400 - CACHE-MANAGER mir/weights, /lus/snx11062/cache/20201117 does not exist
mars - INFO - 20210305.160400 - Cache file /lus/snx11207/cache/20201117/mir/weights/15/linear/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-cccf03137890302bd35031132685012b.mat does not exist
mars - INFO - 20210305.160401 - Creating cache file /lus/snx11207/cache/20201117/mir/weights/15/linear/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-cccf03137890302bd35031132685012b.mat
mars - INFO - 20210305.160401 - CacheManager creating file /lus/snx11207/cache/20201117/mir/weights/15/linear/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-cccf03137890302bd35031132685012b.mat
mars - INFO - 20210305.160404 - Cache file /lus/snx11207/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat does not exist
mars - INFO - 20210305.160404 - Creating cache file /lus/snx11207/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - INFO - 20210305.160404 - CacheManager creating file /lus/snx11207/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - ERROR - 20210305.160404 - Exception: Failed system call: utime(path_.c_str(), &times) in (/scratch/ma/deploy/metabuilds/ecflow-metab_5062/cca/GNU.73/mars_client/mars_client/eckit/src/eckit/filesystem/LocalPathName.cc +707 touch) (Operation not permitted)
mars - ERROR - 20210305.160404 - Error creating cache file: /lus/snx11207/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat (Failed system call: utime(path_.c_str(), &times) in (/scratch/ma/deploy/metabuilds/ecflow-metab_5062/cca/GNU.73/mars_client/mars_client/eckit/src/eckit/filesystem/LocalPathName.cc +707 touch) (Operation not permitted))
mars - INFO - 20210305.160404 - Cache file /lus/snx11209/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat does not exist
mars - INFO - 20210305.160404 - Creating cache file /lus/snx11209/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - INFO - 20210305.160404 - CacheManager creating file /lus/snx11209/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - ERROR - 20210305.160404 - Exception: Failed system call: utime(path_.c_str(), &times) in (/scratch/ma/deploy/metabuilds/ecflow-metab_5062/cca/GNU.73/mars_client/mars_client/eckit/src/eckit/filesystem/LocalPathName.cc +707 touch) (Operation not permitted)
mars - ERROR - 20210305.160404 - Error creating cache file: /lus/snx11209/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat (Failed system call: utime(path_.c_str(), &times) in (/scratch/ma/deploy/metabuilds/ecflow-metab_5062/cca/GNU.73/mars_client/mars_client/eckit/src/eckit/filesystem/LocalPathName.cc +707 touch) (Operation not permitted))
mars - INFO - 20210305.160404 - Cache file /lus/snx11208/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat does not exist
mars - INFO - 20210305.160404 - Creating cache file /lus/snx11208/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - INFO - 20210305.160404 - CacheManager creating file /lus/snx11208/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - ERROR - 20210305.160404 - Exception: Failed system call: utime(path_.c_str(), &times) in (/scratch/ma/deploy/metabuilds/ecflow-metab_5062/cca/GNU.73/mars_client/mars_client/eckit/src/eckit/filesystem/LocalPathName.cc +707 touch) (Operation not permitted)
mars - ERROR - 20210305.160404 - Error creating cache file: /lus/snx11208/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat (Failed system call: utime(path_.c_str(), &times) in (/scratch/ma/deploy/metabuilds/ecflow-metab_5062/cca/GNU.73/mars_client/mars_client/eckit/src/eckit/filesystem/LocalPathName.cc +707 touch) (Operation not permitted))
mars - ERROR - 20210305.160404 - Exception: UserError: CacheManager cannot create key=nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a, tried: /ec_coeff/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat, /lus/snx11207/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat, /lus/snx11209/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat, /lus/snx11208/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - ERROR - 20210305.160404 - MIR: UserError: CacheManager cannot create key=nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a, tried: /ec_coeff/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat, /lus/snx11207/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat, /lus/snx11209/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat, /lus/snx11208/cache/20201117/mir/weights/15/nearest-neighbour/R640-296ba3f6fbb8b96514a92b95da48e127-89.8924:0:-89.8924:359.859/LL-0.15x0.15-7.85:-9.3:-7.6:9.6-rot:-50:-4.5:0-06caa329921a265e30ee321bac3a389a.mat
mars - ERROR - 20210305.160404 - Interpolation failed (-2)
mars - WARN - 20210305.160404 - Visiting database marsod : expected 3, got 2
mars - INFO - 20210305.160404 - 2 fields have been interpolated on 'ccappn017'
mars - ERROR - 20210305.160404 - Expected 3, got 2.
mars - ERROR - 20210305.160404 - Request failed
mars - INFO - 20210305.160404 - Request time: wall: 2 min 1 sec cpu: 3 sec
mars - INFO - 20210305.160404 - Read from network: 10.41 Mbyte(s) in < 1 sec [71.77 Mbyte/sec]
mars - INFO - 20210305.160404 - Processing in marsod: wall: 1 min 56 sec
mars - INFO - 20210305.160404 - Visiting marsod: wall: 2 min 1 sec
mars - INFO - 20210305.160404 - Post-processing: wall: 5 sec cpu: 3 sec
mars - INFO - 20210305.160404 - Writing to target file: 32.81 Kbyte(s) in < 1 sec [5.95 Mbyte/sec]
mars - INFO - 20210305.160408 - Memory used: 980.21 Mbyte(s)
mars - ERROR - 20210305.160408 - Some errors reported
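
A log this long is easier to read once it is reduced to its ERROR and WARN lines. The following is only a minimal sketch, assuming the log keeps the "mars - LEVEL - timestamp - message" layout seen above; the function names are hypothetical and not part of MARS or HARMONIE.

```python
import re
from collections import Counter

# Layout assumed from the extract above: "mars - LEVEL - YYYYMMDD.HHMMSS - message"
LOG_LINE = re.compile(
    r"^mars\s+-\s+(?P<level>[A-Z]+)\s+-\s+(?P<stamp>\d{8}\.\d{6})\s+-\s+(?P<msg>.*)$"
)


def parse_mars_log(text):
    """Yield (level, timestamp, message) for every line in the assumed layout."""
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            yield match["level"], match["stamp"], match["msg"]


def level_counts(text):
    """Count how many lines were logged at each level."""
    return Counter(level for level, _, _ in parse_mars_log(text))


def first_error(text, max_len=120):
    """Return (timestamp, message) of the first ERROR line, or None if there is none."""
    for level, stamp, msg in parse_mars_log(text):
        if level == "ERROR":
            return stamp, msg[:max_len]  # truncate very long cache paths
    return None


# Four short lines copied from the extract above.
SAMPLE = """\
mars - INFO - 20210305.160203 - Requesting 3 fields
mars - ERROR - 20210305.160404 - Interpolation failed (-2)
mars - WARN - 20210305.160404 - Visiting database marsod : expected 3, got 2
mars - ERROR - 20210305.160404 - Request failed
"""

print(level_counts(SAMPLE))
print(first_error(SAMPLE))
```

In the full extract the first ERROR line is the failed utime call on the cache file, which comes before "Interpolation failed (-2)".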

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2496

  • Ulf Andrae
Carlos,

The domain specification is OK as far as I can see. It looks like an internal MARS problem. Does it fail for a recent date as well?

Ulf
The following user(s) said Thank You: Carlos Geijo Guerrero

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2497

Ulf,

I have not tried more recent dates, but these dates were OK with cy40 just weeks ago...

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2498

  • Ulf Andrae
Carlos,

In CY40 we extract data in pure latlon, whereas in CY43 it is done in rotated latlon.

Ulf
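
For readers comparing the two kinds of request, the visible difference in the request text is the ROTATION key, which in MARS gives the latitude/longitude of the south pole of the rotated grid. The sketch below is illustrative only: it uses a subset of the keys and values from the request logged earlier in this thread, the helper names are hypothetical, and the non-rotated variant simply omits ROTATION (a real pure lat-lon request would normally need a different AREA as well).

```python
def render_request(keys):
    """Render a dict as RETRIEVE request text in the layout shown in the log."""
    lines = ["RETRIEVE,"]
    items = list(keys.items())
    for index, (key, value) in enumerate(items):
        comma = "," if index < len(items) - 1 else ""  # no comma after the last key
        lines.append(f"{key} = {value}{comma}")
    return "\n".join(lines)


def with_rotation(keys, south_pole_lat, south_pole_lon):
    """Return a copy of the request with a ROTATION key added."""
    rotated = dict(keys)
    rotated["ROTATION"] = f"{south_pole_lat}/{south_pole_lon}"
    return rotated


# Subset of the logged request; values copied from the log above.
BASE = {
    "PARAM": "31/34/43",
    "DATE": "20120927",
    "TIME": "0600",
    "STEP": "000",
    "AREA": "7.9/-9.3/-7.6/9.6",
    "GRID": "0.15/0.15",
}

print(render_request(BASE))                               # no ROTATION key
print(render_request(with_rotation(BASE, -50.0, -4.5)))   # ROTATION as in the log
```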

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2499

  • Bert van Ulft
Hi Carlos,

Oskar Landgren and I experienced the same error with HCLIM43 last week, but this morning it seemed to be working again. Could you test if your Prefetch_boundaries also works again?

Bert

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2500

Hello Bert,

A few minutes before your message I had submitted a run with ROTATED_MARS_FIELS=no in Gen_domain_hdr. It is quite possible that the problem has been fixed since last week; my failed experiment was submitted last Friday. I'll try the rotated extraction once this run has finished.

Carlos

Interpolation err at Prefetch_boundaries from MARS 10 months 2 weeks ago #2501

Hello again,

Yes, the rotated extraction has now worked well. Incidentally, my modification in Gen_domain_hdr had no effect, because the ecf domain description container had already been generated in the failed run and was picked up again in this new one.

Carlos

Interpolation err at Prefetch_boundaries from MARS 9 months 3 weeks ago #2502

I think I have a similar, very strange error now that I am using 43h2.1.1 for the first time.
My experiment fails in MARS_stage_bd with the following error:
==============
cray-snplauncher/7.5.3 default 2017/05/17 13:11:08
TASKS=
[mpiexec@ccappn018] set_default_values (/notbackedup/tmp/ulib/mpt_base/mpich2/src/pm/hydra/ui/mpich/utils.c:1563): no executable provided
[mpiexec@ccappn018] HYD_uii_mpx_get_parameters (/notbackedup/tmp/ulib/mpt_base/mpich2/src/pm/hydra/ui/mpich/utils.c:1774): setting default values failed
[mpiexec@ccappn018] main (/notbackedup/tmp/ulib/mpt_base/mpich2/src/pm/hydra/ui/mpich/mpiexec.c:163): error parsing parameters
MARS returned 255 on trial=1
TASKS=
[mpiexec@ccappn018] set_default_values (/notbackedup/tmp/ulib/mpt_base/mpich2/src/pm/hydra/ui/mpich/utils.c:1563): no executable provided
[mpiexec@ccappn018] HYD_uii_mpx_get_parameters (/notbackedup/tmp/ulib/mpt_base/mpich2/src/pm/hydra/ui/mpich/utils.c:1774): setting default values failed
[mpiexec@ccappn018] main (/notbackedup/tmp/ulib/mpt_base/mpich2/src/pm/hydra/ui/mpich/mpiexec.c:163): error parsing parameters
MARS returned 255 on trial=2
MARS STAGE failed
ERROR:ECF_ABORT_HM
==================
I would be very happy if someone could explain how to get rid of this. It would be a pity to go back to 43h2.1 just because of this problem...

I do have another experiment with 43h2.1, for a different period, running without any problem.

Thanks in advance,
Roger

Interpolation err at Prefetch_boundaries from MARS 9 months 3 weeks ago #2503

  • Ulf Andrae
Roger,

What's the experiment name?

Ulf

Interpolation err at Prefetch_boundaries from MARS 9 months 3 weeks ago #2504

Ulf,

T43HRRSN uses 43h2.1
T43ATWCN uses 43h2.1.1

Thanks in advance for your help,
Roger

Interpolation err at Prefetch_boundaries from MARS 9 months 3 weeks ago #2505

  • Ulf Andrae
Roger,

Your experiment is corrupted. You don't have a local version of, e.g., suites/harmonie.tdf, but if you run

diff /scratch/ms/no/sbt/hm_home/T43ATWCN/lib/suites/harmonie.tdf ~/harmonie_release/git/tags/harmonie-43h2.1.1/suites/harmonie.tdf

you'll see that you have loads of differences. Could it be that you changed the version in an existing experiment?

Time to start a new experiment or a careful cleaning.

Ulf
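
The single diff above can be extended to the whole tree to see how far an experiment has drifted from the release it is supposed to use. This is a minimal sketch, assuming the experiment's lib directory mirrors the layout of the release tag; differing_files is a hypothetical helper, not part of HARMONIE.

```python
import filecmp
from pathlib import Path


def differing_files(experiment_dir, release_dir):
    """Return relative paths of files present in both trees whose contents differ.

    Files that exist only in the experiment are skipped, since they may be
    intentional local modifications.
    """
    experiment_dir = Path(experiment_dir).expanduser()
    release_dir = Path(release_dir).expanduser()
    diffs = []
    for path in sorted(experiment_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(experiment_dir)
        counterpart = release_dir / relative
        # shallow=False forces a byte-by-byte comparison instead of trusting size/mtime
        if counterpart.is_file() and not filecmp.cmp(path, counterpart, shallow=False):
            diffs.append(relative.as_posix())
    return diffs
```

Called with the two directories from the diff command above (the experiment's lib directory and the harmonie-43h2.1.1 tag), a long list of files that were never edited locally would point to mixed versions.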

Interpolation err at Prefetch_boundaries from MARS 9 months 3 weeks ago #2506

Ulf,

Sounds like I've mixed two versions. I do have a guess about what could have happened. I'm creating a (completely) new experiment now.

Thanks a lot for your help.

Roger