使用强化学习的磁性软机器人的自适应驱动

首页
关于我们
产品中心

电磁铁

WDS系列推荐 SB系列 SBV系列钳式电磁铁推荐多极电磁铁推荐单极电磁铁交流电磁铁旋转电磁铁更多

亥姆霍兹线圈

一维亥姆霍兹线圈三维亥姆霍兹线圈推荐二维亥姆霍兹线圈三维等径线圈梯度磁场线圈磁通线圈定制非标线圈工频磁场干扰线圈/探测线圈/发射线圈更多

磁场线圈

螺线管线圈推荐巴克尔线圈三维屏蔽筒线圈方形亥姆霍兹线圈推荐大型磁场线圈加勒特螺线管磁约束线圈无矩线圈更多

永磁体

标准永磁体推荐核磁共振永磁体可调永磁铁海尔贝克永磁体推荐 YC系列永磁体旋转永磁铁单极可调磁场发生器轴向永磁体更多

测试/实验系统

振动样品磁强计推荐静态磁场主动屏蔽系统推荐动态磁场主动屏蔽系统磁场控制系统高精度振动样品磁强计推荐更多

励磁电源

YP系列直流电源推荐 YPD系列直流电源 HAP系列直流电源 HEA系列功率放大器更多

教学实验仪器

B-H磁滞回线实验仪毕奥萨伐尔实验仪磁致伸缩实验仪各向异性磁阻传感器与磁场测量实验仪霍尔效应实验仪金属电子逸出功实验仪居里温度实验仪连续波核磁共振实验仪更多

磁测量及应用产品

高斯计推荐磁通门计磁通计电磁铁冷水机推荐零磁腔体磁屏蔽筒更多
解决方案

热门案例

高精度振动样品磁强计三维交流线圈磁场控制系统三维线圈动态屏蔽系统英普2025新款振动样品磁强计VSM-T1常温旋转电磁铁控场系统更多

应用场景

使用强化学习的磁性软机器人的自适应驱动磁场作用下低共熔溶剂内的传输特性分析及以其为电解液的液流电池性能改善研究原位横向磁场辅助电弧定向能量沉积316L不锈钢成效显著振动样品磁强计应用于磁致伸缩材料研究更多
产品定制
联系我们

Figure 1. Numerical model of MSRs. a) Cosserat rod model abstracts a slender structure into a centerline curve and cross sections. Positional vector r is used to describe coordinates of nodes under the Eulerian coordinates. A triad of orthogonal vectors {d1,d2,d3} is used to describe the orientation of elemental cross sections. d3 is usually chosen to be perpendicular to the corresponding cross section and is not in the same direction with centerline tangent s ∂ r due to shear. b) Elements in MSRs with magnetization M are driven by magnetic torques τm to be aligned toward the direction of external magnetic field B. M is set to be relatively static with the triad {d1,d2,d3} so the direction of magnetization will move with the motion of elements. c) Elemental damping forces are modeled to be proportional to the difference of speed between nodes in order to eliminate vibration quickly and at the same time make as little influence on rigid body motion.

Figure 2. Components and procedures of the RL control system. a) Interaction between different parts of the TD3 algorithm. b) Training procedures for obtaining stable control policies. c) The simulation platform we use. In the picture, we annotate states and actions. In our work, we choose componentsof f ield increment ΔB as actions, and quantities involving necessary information of robot motion or magnetic field are defined as states. d) State variables consisting of angular orientation, angular velocity, vertical position, contact indicator, field angle, and field amplitude. Notice that state variables involving robot motion are defined element-wise or node-wise, so the fineness of segmentation will affect the total size of state variables. e) The structure of the actor network. The actor is used to map states to actions. f) The structure of the critic network. The critic is used to evaluate state-action pairs. g) Overall procedures of our RL control system. After training, we validate the stable control policies in simulations and, at the same time, obtain a set of magnetic f ields. The fields are then used to control the power supply via an STM32 control board. h) Helmholtz coils. For now, we use only two pairs of coils. i) Magnetization method of strip-like soft robots. MSRs presented in our cases are folded and put into a background field in two directions; therefore, robots with two different magnetization patterns are produced.

Figure 3. Learning results under relatively small field amplitude. Notice that the plots are from data in simulations. The dimensions of our robots are 20mm 8mm 0.8mm. a) Comparisons between simulation and experiment of the robot with magnetization 1. Directions of magnetic fields are indicated with a white arrow, and amplitudes of fields are annotated in number. The real soft robot reaches its maximum height about 0.3s faster than the simulated one because our dampingmodelleads tosomeoverdampingeffect onrotation, which leads to some mismatch in the comparisons, but overall moving gaits betweenthe two are in goodagreement.SeeMovieS1,SupportingInformation for more. b) Regularized height or span of the robot in 0–3s. Whenthespanreachesitsminimum, theheightreaches itsmaximum,andvice versa.c) Strengthof the front end and the backend contactingthe ground in 0–3s. Whentheheight of therobot reaches a maximum,the contactstrengthof the back end becomes largerthan the front end. d) Scatteredfield plots of applied to the robot in 0–10s. e) Comparisons between simulation and experiment of the robot with magnetization 2. See Movie S2, Supporting Information for more. f–h) The plots of the robot with magnetization 2 corresponding to (b–d). The learned moving gait is very similar to the one of magnetization 1. Two scattered plots of fields share similar contour shapes but have different orientations, which indicates similar torques among the two robots.

Figure 4. Comparisons between simulations and experiments of the robot with magnetization 1 under relatively large fields. Two different gaits are learned due to run-to-run variances of TD3. The experiments show good agreements with simulations with the exception of highly dynamic conditions at the beginning of gait 2 due to overdamping on rotations. See Movie S3 and S4, Supporting Information for more.

Figure 5. Q-value distributions for different actions estimated by different agents. We get these heat maps by inputting uniformly distributed actions to critics under certain states. The axes are in the unit of mT. In the plots, the points annotated by “Action Chosen” are the actual field increment chosen by actors. a) Q-value distributions when robots reach their highest point. Two agents trained under different magnetization patterns both choose different actions that correspond to points with a maximum Q-value. b) Q-value distributions when robots reach their lowest point. The action selected by the agent of magnetization 2 does not have the highest Q-value. c) Q-value distributions of agents corresponding to two gaits are shown in Figure 4. We choose the specific state to be at the starting position because it is shared by both gaits.

电磁铁

亥姆霍兹线圈

磁场线圈

永磁体

测试/实验系统

励磁电源

教学实验仪器

磁测量及应用产品

热门案例

应用场景

使用强化学习的磁性软机器人的自适应驱动

使用强化学习的磁性软机器人的自适应驱动