About the DNN program: the program stops prematurely when the bunchsize is changed.
This is a bug in the original program, and you need to modify the code. Please replace the original code at line 490 of dnn_helper.cpp with the new code.
the old code:
the new code:
How much is it to compete in the ASC Student Supercomputer Challenge?
ASC is free to get in! Plus, we provide supercomputing systems so you don't have to worry about finding a sponsor. You'll also have access to HPC classes, training materials, and a tour in a supercomputing center -- all complimentary. If your team makes the Top 16, we can even cover housing and meals. We can provide supporting paperwork if you need to look for funding for transportation.
Some teams run their programs on the compile nodes, which makes those nodes very slow. Please help!
Running HPCG, DNN, or MASNUM on the compile nodes is forbidden. If a user breaks this rule, the monitoring script will kill the test program immediately. Test programs must be submitted to the cluster through a PBS script.
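A minimal sketch of such a submission, assuming a Torque/PBS-style scheduler. The job name, the resource request (nodes, ppn, walltime), and the executable name dnn_train are placeholders, not values taken from the platform:

```shell
# Write a small PBS job script (all resource values below are assumptions).
cat > run_dnn.pbs <<'EOF'
#!/bin/bash
#PBS -N dnn_test
#PBS -l nodes=1:ppn=24
#PBS -l walltime=01:00:00
#PBS -j oe

# PBS starts jobs in the home directory; return to the submission directory.
cd "$PBS_O_WORKDIR"
./dnn_train   # hypothetical executable name
EOF

# Submit through the scheduler instead of running on a compile node.
if command -v qsub >/dev/null 2>&1; then
    qsub run_dnn.pbs
else
    echo "qsub not found; submit run_dnn.pbs from a login node"
fi
```

The job then runs on a compute node chosen by the scheduler, so the compile nodes stay free for compiling.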
The first question, on designing an HPC system, requires the design to be based on the Inspur NF5280M4. Our team does not have an NF5280M4, so can we design the cluster and evaluate its power consumption theoretically?
The cluster design should be based on the Inspur NF5280M4, but you are not required to actually build it. Your design should meet the requirements, be reasonable, rest on a correct theoretical analysis, and highlight its strengths.
How do we download the ASC training materials from the platform?
The training materials are stored in /home/public/training on the platform. You can download them with the xftp tool, or log in to a machine with internet access and run the command:
scp email@example.com:/home/public/training/* destination
What kinds of files should the MASNUM submission include?
The MASNUM submission should include:
1. The result files of the two validation examples (pac_ncep_wav_20090228.nc for exp1, and global_ncep_wav_20090228.nc for exp2).
2. If "Compare Success" is printed on screen, the validation has passed. Please add the screenshots to your proposal.
3. Optimized source codes of MASNUM.
I'm trying to optimize the masnum_wam application. Can I change the parameters (including CISTIME, CIETIME, COOLS_DAYS, DELTTM, WNDFREQ, WNDTYPE, OUTFLAG, WIOFREQ, CIOFREQ, RSTFREQ) in the file ctlparams under the directory 'exp'?
You can only modify the parameters related to the parallel settings. Other modifications of the workloads are prohibited.
The team must also pass the correctness check for each workload.
Among the MASNUM output files there are some restart files. I've found little information about them in the MASNUM documentation and don't know how to use them.
Can you give us some tips or help for using these restart files?
The tips for restart files are in the "userguide" PDF document.
Do we need to store all the output files (from the first day to the last day), or can we ignore some days and output only the last day?
We ask because the control parameter file (ctlparams) has many parameters that we can tune. Some parameters are important for the simulation, like istime and ietime. Some parameters change the I/O frequency, like outflag, wiofreq....
So can we change the output frequency for some output files?
Changing the parameters related to output frequency is not allowed, so you should store all the output files on your own test platform for results validation.
What is the problem with calling MKL?
If you call MKL on the MIC, please compile the code on the MIC node, not on the management node.
I would like to ask whether we can modify the architecture of the DNN. Can we reduce the number of neurons in each layer or the number of layers?
The team members can’t modify the architecture of the DNN and can’t reduce the number of neurons in each layer or the number of layers.
We need a license to be able to import data into the Teye application. Where can we get the Teye license?
You need to provide some information, such as your name, email, reason for the request, and MAC address. The license is a one-month trial. Note that we only provide technical support for ASC, so we cannot answer questions about problems you run into during the software installation.
We want to use VTune to find the DNN hotspots. Do we need to install it?
Intel VTune is already installed on the platform. The path is: /opt/intel/vtune_amplifier_xe_2015/bin64
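A minimal sketch of a command-line hotspot collection with amplxe-cl, the command-line tool shipped with VTune Amplifier XE. The executable name dnn_train and the result directory name are placeholders, not names from the platform:

```shell
VTUNE_DIR=/opt/intel/vtune_amplifier_xe_2015
APP=./dnn_train        # hypothetical executable name
RESULT_DIR=r000hs      # hypothetical result directory

if [ -x "$VTUNE_DIR/bin64/amplxe-cl" ]; then
    # Collect hotspot data while the application runs.
    "$VTUNE_DIR/bin64/amplxe-cl" -collect hotspots -result-dir "$RESULT_DIR" -- "$APP"
    # Print a text report of the hottest functions.
    "$VTUNE_DIR/bin64/amplxe-cl" -report hotspots -result-dir "$RESULT_DIR"
else
    echo "amplxe-cl not found under $VTUNE_DIR/bin64"
fi
```

Since DNN may not be run on the compile nodes, a collection like this would presumably go inside a PBS job script rather than be run interactively.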
Do you have the PPTs which were displayed during the Beijing training camp? We have the hard copies but we do need the electronic ones.
You can download them from the platform: asc16.inspur.com. The directory is /home/public/training
How can I get Intel VTune Amplifier XE?
VTune Amplifier XE is installed at /opt/intel/vtune_amplifier_xe_2015, and you can use it from there.
Are we supposed to design our HPC using hardware and create an actual HPC system or just the requirements for it in a simulation or something like that?
Also, the HPCG test and the MASNUM-WAM test mention running the tests on our own hardware; that part is not very clear either. Could you kindly shed some light on this?
About the first question, designing your HPC system: you should design an HPC system within the 3000 W power limit using the hardware listed in the table. A proposal or design scheme is sufficient.
For the HPCG and MASNUM-WAM tests, we suggest running them on your own hardware, because the resources of the remote platform are limited, and we pay more attention to the analysis process than to the result.
When I used your remote platform to compile my DNN source code, the following problem occurred:
On our machine (CentOS 7.1, ICC 2016, Xeon Phi 31S1) my source code compiles without any problem. Can you please check your environment or tell me how to fix this problem?
MIC is not installed on node cm2. Please log in to a node with a MIC environment using the command 'ssh mc1' and compile there.
When I compile a simple MIC program, the compiler reports that a source file could not be opened. The error output is as follows:
$ icc -o MIC pi_mic.c -openmp
icc: warning #10362: Environment configuration problem encountered. Please check for proper MPSS installation and environment setup.
pi_mic.c(1): catastrophic error: *MIC* cannot open source file "stdio.h"
Please check that you are compiling the MIC program on node mc1. The login nodes cm1-cm4 do not provide an MPSS environment.
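A small guard like the following can catch this mistake before compiling. It is only a sketch: it assumes mc1 is the only MIC-enabled node (the only one named here) and reuses the compile command from the question:

```shell
# Succeeds only for the node that has the MPSS environment.
is_mic_node() {
    case "$1" in
        mc1) return 0 ;;
        *)   return 1 ;;
    esac
}

host=$(hostname -s 2>/dev/null || hostname)
if is_mic_node "$host"; then
    icc -openmp -o MIC pi_mic.c
else
    echo "On $host: run 'ssh mc1' first, then compile there"
fi
```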
Can you give me the paths of the PGI and XL Fortran compilers?
The platform only provides the Intel compiler. If you need another compiler, you can install it yourself.
How do we get the training materials?
You can get them from /home/public/training.
In the DNN application, can we modify the weight decay and over-fitting?
In the preliminary contest the degree of parallelism is not very high, so modifying the algorithm is not allowed.
The bunch size is 1024. Can it be modified?
The bunch size may not be changed; however, a bunch may be split up for computation.
So in the program, the computation must be carried out on the whole bunchsize, which can't be split. Is that right?
The bunchsize is the size of the batch. Internally, the bunch can be split and the parts computed separately.
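As an illustration only, the following sketch shows what splitting a fixed bunch of 1024 samples into contiguous parts could look like. The worker count and the function name split_bunch are hypothetical, and this is not how the DNN program itself partitions its work:

```shell
# Print the sample range each worker would handle for a fixed bunch size.
split_bunch() {
    bunch=$1
    workers=$2
    base=$((bunch / workers))
    extra=$((bunch % workers))
    start=0
    w=0
    while [ "$w" -lt "$workers" ]; do
        size=$base
        # The first 'extra' workers take one more sample when the split is uneven.
        if [ "$w" -lt "$extra" ]; then
            size=$((size + 1))
        fi
        echo "worker $w: samples $start..$((start + size - 1))"
        start=$((start + size))
        w=$((w + 1))
    done
}

split_bunch 1024 4
```

The bunch size stays 1024 throughout; only the way the work inside one bunch is distributed changes.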
Do you have an MKL manual for MIC? Can you send me one? I call the MKL functions following the MIC example, but I get a message that a parameter is wrong.
Please see the file index.htm under the installation directory: /opt/intel/composer_xe_2015.0.090/Documentation/en_US/mkl/mklman
I have a question about HPCG configuration. How do I write the HPCG configuration files? Is there a template?
Download the HPCG package from http://www.hpcg-benchmark.org/software/index.html, then uncompress it. The file INSTALL gives a detailed introduction to configuring, building, and testing. Please read it carefully.
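The steps described in INSTALL look roughly like the sketch below. It assumes the package was unpacked into a directory named hpcg (a placeholder) and uses the Linux_MPI setup template shipped in the setup directory of the package; the problem size, run time, and process count are example values only:

```shell
HPCG_DIR=./hpcg      # hypothetical unpack directory
ARCH=Linux_MPI       # refers to setup/Make.Linux_MPI in the package

# Input file: line 3 is the local grid size nx ny nz (multiples of 8),
# line 4 is the requested run time in seconds.
cat > hpcg.dat <<'EOF'
HPCG benchmark input file
Sandia National Laboratories; University of Tennessee, Knoxville
104 104 104
60
EOF

if [ -d "$HPCG_DIR/setup" ]; then
    mkdir -p "$HPCG_DIR/build"
    # Configure for the chosen setup file, then build bin/xhpcg.
    ( cd "$HPCG_DIR/build" && ../configure "$ARCH" && make )
    cp hpcg.dat "$HPCG_DIR/build/bin/"
    echo "run with: cd $HPCG_DIR/build/bin && mpirun -np 4 ./xhpcg"
else
    echo "HPCG source not found in $HPCG_DIR"
fi
```

To target a different compiler or MPI library, copy one of the Make.* files in the setup directory, adjust it, and pass its suffix to configure.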
Which optimizations are allowed for HPCG? Can we modify the code?
To get the best HPCG results, you may modify the code; we impose no special restrictions of our own. We recommend reading chapter 6, "PERMITTED TRANSFORMATIONS AND OPTIMIZATIONS", of the HPCG-Specification article. You need to follow these rules. Get this article from:
Wenjing Lv & Weiwei Wang